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ABSTRACT 


The sequential solution, in recursive form, of a 
growing set of linear equations, based upon the least- 
Square-error and a weighted least-Square-error criterion, 
are developed. For comparison these results are applied 
to the discrete-time solution of several estimation and 
identification problems. Recursive algorithms for pseudo 
inversion and the best approximate solution of a set of 
linear equations are included. Finally, efficient state 
estimation procedures for time-invariant systems using a 


sliding-window observer are presented. 
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Te INSRODUCTION 


In a broad sense this dissertation 1s concerned wren 


the basic problem of solving the linear matrix equation* 


Ax = b (1. 


where A is an mxn matrix, X an n x 1 vector of unknowns, 
Pace bedmem x Meyector Cf veenscamts-— thie solution, if A has 
rank r = n, is straightforward and reduces essentially to 
matrix inversion 

-1 


& = [AA] Ab (aaa) 


where Ae 1s the transpose of A, tata] 1s the inverse of 
Am, and where x denotes the solution of (1.1). However, 
when the rank of A is less than n, the set of equations 
(1.1) is underdetermined and infinitely many solutions 
exist. In order to select a unigue solution out of all 
possible solutions further constraints are imposed. In 
the work presented here the minimum-norm solution, as de- 


fined by Penrose [5], is selected. If x denotes the 


selected solution, and x, any other possible solution, then 


HEI < Wxol Cis) 


where ||x|| = trace xx’. The solution of (1.1) is further 


: , 

Throughout this dissertation a bar under a lower case 
letter represents a column matrix or vector. Capital 
letters generally refer to matrices. 


complicated when the set of equations is inconsistent so 
that there is no solution which satisfies all equations. 
In this case a compromise solution has to be accepted 

such that all equations are satisfied as close as possible 
according to some criterion. Usually this compromise 


solution is selected by minimizing an error criterion J. 
J = £f{e) (1.4) 


where @ = )Ax,- b.,The most, commonly, used criterion 1S ste 


least-square-error sum 


— e“e = Eeace ee! a) e? (lS 


where ey is the yuh element of the vector e. 


When this criterionis combined with the minimum-norm con- 
dition (1.3) a unique general solution results which 
Penrose [5] has defined as the best approximate solution and 
which is obtained using the concept of the pseudo inverse. 
This general approach to the solution of (1.1) is par- 
ticularly useful when applied to discrete-time system 
problems such as state estimation, parameter identification, 
and the limited memory observer problem as discussed in the 
body of this dissertation. As an example, consider the 
problem of estimating the states of a linear, dynamic system 
from noise-contaminated measurements. The dynamic system 


and the measurements are given by 


(1.6a) 


( LG») 


where xy is the system state vector, % rT the discrete, 

= ’ 
time-varying transition matrix, M, a time-varying observation 
Matrix Of dimensions I xn, Vy the measurement noise, and Zy 


a scalar observation. After k observations the data may be 


arranged as follows 


ial | 

74 earls 

| 

IZ M.o 

M2) 2 [202k x, (1.7) 


Thus the state estimation problem for linear, discrete-time 
systems is reduced to the problem of solving (1.1). When 
the transition matrix is the identity matrix this problem 
Beamces tO estimating a constant but unknown vector x. “Tite 
estimation problem has been considered for many years by 
such famous mathematicians as Gauss [1], Penrose [4,5], 
Kalman [2], Deutsch [1] and many others. In spite of their 
differing approaches the underlying concept remains the 
SemrucLonm of (1.1). 

The problem of identifying the coefficients of the 
recursive difference equation describing a time-invariant, 
linear system from a sequence of noisy response measure- 
ments can also be reduced to the solution of (1.1). Pre- 


vious investigators have solved this identification problem 


using different methods such as correlation functions, 
deconvolution techniques, adaptive model techniques, and 
others as discussed by Mishkin and Braun [17], Eveleigh 
[18], Eykkoff [19] and many others. The approach using 
the problem formulation of (1.1) has been considered by 
R. C. K. Lee Tea 

Thus, it has been well established that it is possible 
to solve discrete estimation or identification problems 
using the concept of the best approximate solution of (1.1). 
For real-time computation, as required for example in some 
self-adaptive control systems, it is desired to obtain 
numerical results sequentially with a minimum of computation 
time and data storage. Since the dimension, m, of matrix A 
grows at each step in time as additional data is acquired, 
it is desirable to formulate a sequential solution to (1.1) 
in recursive form. Almost all known algorithms are based 
upon the assumption that matrix A has rank r = n whenever 
m > n, which might not be true in general. The algorithm 
developed in this dissertation solves (1.1) sequentially 
for the best approximate solution [5] without any assump- 
tions as to the rank of A. The normalized least-square- 
error solution and an algorithm, an alternate formulation 
based upon a different error criterion, are developed and 
applied to an estimation and an identification problem. 
For completeness, recursive non-sequential forms for 
obtaining the Penrose inverse and the best approximate 


solution of (1.1), when the matrix A has constant dimensions, 


are included. Finally, efficient procedures for sequential 
state estimation of time-invariant systems are presented, 
where the state estimation is obtained from a finite but 
continuously updated set of observations (sliding-window 
Observer). These results fall into the category of linear 
Observer theory as discussed by Luenberger [11] and Bona 
eZ" le 

The development and discussion of the foregoing results, 
including geometric interpretation and examples, are pre- 
sented as follows. In Chapter II, Eqs. (1.1) are considered 
to be a set of overdetermined equations. The closed form 
solution, as well as the recursive solution (Kalman type 
filter), are well known. However, a geometric interpre- 
tation of the known results suggests an alternate way of 
selecting the compromise solution using a weighted least- 
square-error criterion. This solution is defined here as 
the normalized least-square-error solution. The normalized 
least-square-error solution in recursive form, which requires 
Only a slight modification of the least-square-error 
algorithm, is applied to a specific estimation problem as 
well as to an identification problem. The latter consists 
of identifying the coefficients of a difference equation 
describing a dynamic, linear, time-invariant system from 
system response data. The experimental results are quite 
favorable for the normalized least-square-error solutions. 
For the identification problem it is shown that with 


Sinusoidal excitation the normalized least-square-error 


solution results in a smaller bias error. However, these 
results cannot be generalized and whether the normalized 
least-square-error algorithm should be used depends 
entirely upon the specific problem under consideration. 

In Chapter III the general solution of (1.1) when the 
rank of A is less than or equal to n is discussed. First 
the definition and some properties of the pseudo inverse 
and the best approximate solution according to Penrose 
[4,5] are stated and some alternate expressions for the 
pseudo inverse are discussed. Then a complete recursive 
algorithm for the best approximate solution for the general 
case is developed for use as a sequential-estimation pro- 
cedure. The algorithm presented here has the advantage 
over previously published results in that the dimensions of 
the matrices involved in the algorithm remain constant 
irrespective of the dimension, m, and rank of A. It is 
interesting to note that the form of the resulting filter 


equation 


X, = aay tg (2, - ay a) (1 ay 
evolves naturally by solving the necessary equations and 
is not assumed a priori. Finally, the algorithm is adapted 
for state estimation of linear, time-varying dynamic 
systems. 
The solution of a fixed set of equations (1.1) is con- 
Sidered in Chapter IV. Although it is possible to obtain 


the solution with the algorithm of Chapter III, this method 


EO 


is not very efficient because only the final solution, with- 
out intermediate sequential estimates, 1S required. More 
efficient methods are obtained here by combining an infinite 
iterative error correcting method, as developed by Noda and 
Nagumo [10], and the Gram-Schmidt orthogonalization process 
[9]. The results are finite-step algorithms for the solu- 
tion of matrix equation (1.1), matrix inversion (when the 
matrix 1S nonsingular), and matrix pseudo inversion for the 
general case. 

In Chapter V the sequential solution of a growing set 
of equations (1.1) 1S considered again. However it is 
Mesumed a priori that the set of equations is comsiatemt 
and that any subsequent sets of n equations in (1.1) are 
independent. The problem is then solved with the ultimate 
goal of developing finite-memory, sliding-window observers. 
First, the general case of time-varying linear systems 
is considered and a new algorithm for the minimum-window 
observer (an observer with a memory limited to exactly 
the minimal set of data) is developed. However the high 
sensitivity of this observer to measurement noise severely 
limits its use [12]. Introducing the further constraints 
that the system and the observation matrix are time invariant 
leads to more useful and very efficient results. It is 
shown that with these constraints it is possible to construct 
Simple and efficient filters for state estimation from noise- 
contaminated measurements using the results for the minimum- 


window observer. Also, using the concept of the pseudo 


inverse, the memory of the observer may be extended so that 


a sliding-window observer of arbitrary length & >n is 


obtained. 


reduces 


where F 
filters 


with an 


Ee 


Beste A 


and g are constant. The performances 
when processing noisy measurements is 


example. 


In this case the algorithm for state estimation 


ell oa) 


of these 


1llustrated 


Finally, in Appendix A, the solution of a set of non- 


linear equations using the results of Chapter IV is 
attempted. An iteration scheme is presented in which the 
value of the total difference quotient for the nonlinear 
part of the set of equations 1S iterated sequentially to 
its true solution. Computational results are presented for 
examples where this iteration process converges while other 
commonly used methods, such as Newton-Raphson, Gradient, 
and Linear Interpolation fail. This iteration scheme is 
then proposed for the solution of sets of nonlinear 


differential equation for networks containing nonlinear, 


memoryless, dissipative elements. 
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Il. THE NORMALIZED LEAST-SQUARE-ERROR SOLUTION 


In this chapter the solution of a set of overdetermined 
Simultaneous linear equations is considered first, using 
the concepts of the pseudo inverse as developed byPenrose 
[4,5]. The usual least-square-error solution is presented 
and interpreted geometrically. It is then shown that an 
alternate solution, which has been designated here as the 
normalized least-square-error solution, is also possible 
and leads to a different geometrical interpretation. These 
results are then applied to an estimation problem, and a 
recursive algorithm, based upon the normalized least-square- 
error solution, is developed. The resulting equations are 
Similar in form to the Kalman [2] type of discrete filter 
as discussed by R. C. K. Lee [3], but differ substantially 
in their precise formulation and the nature of the results. 
A numerical-estimation example comparing the least-square- 
error filter with the normalized least-square-error filter 
is presented. The recursive equations are then applied 
to the problem of identifying the coefficients of the 
difference equation describing a dynamic system from a 
sequence of noisy measurements of the system's response. 
The results of a numerical example are presented and com- 
pared with the results obtained using the usual least- 
Ssquare-error filter. These results indicate that the error 


in coefficient identification may be less for the normalized 


is 


least-square-error solution, as defined, than for the usual 
least-square-error solution available in the literature. 

An alternate approach to the solution of the identifi- 
cation problem is to use estimates for the past system 
response rather than the past observations themselves in the 
problem formulation. As shown in Example 2.3 this procedure 
may result in a better identification of the system co- 
efficients. Finally, it is demonstrated that in the 
presence of measurement noise a bias error in the identifi- 
cation begins to build whenever the input function is 
constant. However, when the input function is causing a 
Significant dynamic system response, the estimation error 
approaches a constant bias. The analysis of bias error is 
performed by approximating the discrete formulations in 
the continuous time domain so that a limiting process can 
be performed readily. These results are verified experi- 
mentally. 

A. LEAST-SQUARE-ERROR AND NORMALIZED LEAST-SQUARE-ERROR 

SOLUTION 

The most common method of solving a set of m simul- 
taneous equations in n unknowns, where m > n, is the least- 
square-error solution. The problem consists of solving 


the algebraic relationships 
b = Ax (2 aan) 


where A as the m x n matrix of coefficYreénts 
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is the m x l vector of constants 


[oO 


is the n x Ll vector of unknowns. 


es 


A solution for x is the best approximate solUtuon s2i_eveduced 


by Penrose [5]. 


= A’b (2.2) 


| <> 


sp ; a 
mere A is the Penrose pseudo-inversa, amd Gis the bese 
approximate solution. If the matrix A has rank n, this 


Solution is obtained as* 


x = [A’A] Ab (2 38) 


=-] 
where a’ denotes the transpose of A and fata] the inverse 
yale Ie). In this case the solution x minimizes the cost 


mernction J 


i 2 hi 
I= lak - bl = fel? = 2 ef (2.4) 
i=l 
where e = Ax - b m.5) 
and e. represenes the components OCferme Vecger ce. “ainus 


this solution satisfies all the equations of (2.1) as close 
as possible in the least-square-error sense which is shown 


as follows. 


* 

If A has rank less than n (i.e., the rows of A are not 
independent) the pseudo inverse has a different form as 
discussed in Chapter III. 


ite 


Bae Minimum of Eq. (1.4) occurs when 


m oe 
oJ = 2[ax - bl A= 0 (2.6) 
OX 7 
or A Ax = A’b (22a 
Therefore 
an T -l| T 
x = [ATA] A’b (2.8) 


A geometric interpretation of the least-square-error 
solution may be obtained by considering the two dimensional 
case where the vector x has two components Xp1 X5- Bach 
equation of (2.1) represents a line in the Xp1 Xo plane. 
These lines generally do not intersect at a single point. 


Now consider the sequence of lines given by 


Ax =b-e (2.9a) 


pate se 1S a vector with /arbetrary elements, e.- These 
lines are parallel to the original lines but shifted in 


the orthogonal direction by the amount, Sis where 


=o 
e 


as Jue 2 2 
2g er (ai; + a5; ) = w a ees 9 em (2.9b) 
and ais are elements of the matrix A. In the least-Square- 


error solution the lines are shifted so that they all pass 


through the point Xe oe so that the cost function, J 


m 
J= 2 e (2.9c) 
1S minimizea. 
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A different result may be obtained by considering each 
equation of (2.1) as a permissible geometric locus for the 
desired solution point. In general this locus is a hyper- 
plane in n dimensional space, where n is the number of 
elements in the unknown vector x. If all the Megheemtersect 
in a single point this solution corresponds to that of 
(2.3). If they do not intersect in a point, the solution 
point may be defined as the point which minimizes the sum 
of the distances squared to each locus. This solution is 
defined here as the normalized least-square-error solution. 
Its interpretation is guite different from the least-square- 
error solution and in certain types of identification prob- 
lems is seen to be more meaningful than the usual results 
available in the literature. 

The normalized least-square-error solution is a weighted 
least-square-error solution with weighting factors chosen 
such that the solution point lies as close as possible to 
all geometric loci described by (2.1). This result can be 
obtained by selecting the solution x* such that the scalar 


cost function 
s* = 2 ad. (2.10a) 


is minimized, where the d.'s are the distaneés from x* to 
the respective loci designated by the subscript i. Before 
deriving this solution it is deSirable to prove the 


following. 
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The distance from a point 


* * * a 
P (x5 rXo™% 2ee X, ) to the plane 


0 = a,%} - a5X5 + ... + ax, 7 e (2. LO 


elise | (a5 + av +...+ a‘) (a,x,* + a,x5*+...+a,x* - c) | 


(2 Sales) 
Proof of Eq. (2.11) 1s given for the three dimensional case. 


Extension to higher dimensions is obvious. 


Proof 

Let the point P*(x*,y*,z*) and the plane c = a,xtaoyta,z be 
given. The unit vector normal to the plane is found by 
considering a plane through the origin parallel to the 


given plane. Its equation is 


O= a,x + a5Y + az (2.124) 


yi 3 


which is the dot product of two vectors, one the position 


VeGlor Ceeowaly POmmc lmethe "plane 
r= xL + vt + Z¥e. (2. 2s 
and the other a vector normal to the plane 


n= ae wang + ask (21s 


where 1, $, and & are unit vectors defining the three- 


dimensional cartesian space. The unit normal vector is then 
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a oes + a53 + 4k (2.28) 


where 


The distance |d| is then given by (See Fig. 2.1, where N is 
the point in the given plane closest to the given point P* 


ame NP@ denotes the vector from @ tomes) 
fale ege*) = t))| . (eis 


Bere mis the position vector of a point in tie Jgivememme and 


meeisetche vector from the Onigin to the poimt P=. 


Since 9 5 —% 

as a,x + oY + AZ = (a, oa a5 + a.) Cc (22124) 
and 

-L 

r*-N= ie + te + as) wie Steet Bae Ge ez”) (202A) 

- — - 2 3 1 Z 3 ' 
fe =cOollomws that 

-t 
}a| = lar + as 10 ae) ae x* + any* + ajz* - c) | (O20) 
1 2 3 au Z S ' 


Extension to higher dimensions follows the same arguments 
as above and leads to the general result of (2.11). 

The solution to the equations (2.1) which minimizes the cost 
function (2.9) may be obtained by considering a slight 
modification of (2.3). Consider the vector d with elements 


d. given by 


d = WAx - Wb (2.13a) 


ano 





ed i =miiNP = 


FIG. 2.1 DIAGRAM FOR PROOF OF &4q. (2.11) 
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where W is a diagonal mxm matrix of weighting or normalizing 


factors Woe given by 


ale ‘i 
mo Ue T ~% 


n 
Wen =D. Soleo! F (ava, ) Mae = 12, cucu (Zena?) 


Eq. (2.13a) may be rewritten as 


d = 24x = b* (2 3b) 
where Ase (2.15) 
and b* => (2.16) 


The solution of (2.13b) in the least-square-error sense is 


the desired solution, namely 


x* = [axtaxy7? asl ps (2.17) 


It should be emphasized that this method solves a different 
set of equations (2.13a), derived from the original set of 
equations (2.1) by normalizing each equation, using the 
standard least-square-error-solution. Thus the normalized 
least-square-error solution is a weighted least-square- 
meror solutz1on Of (2.1). TRis ts qurce drrterent L£rom 
other possible solutions of (2.1) minimizing alternate 


Soest functions [l8s,21], 1.e. 


where p iS an integer or 
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The following example demonstrates that the solutions (2.3) 


and (2.17) may differ appreciably. 


Example 2.1: Solve the set of 4 equations in 2 unknowns: 


{mae | 
1 1 
= 1 4 


0) 
0 = 
1’ >> {oa 
| 1] /-10 10 
mecording to (2.3) 

0) 

.099 


I> 
il 


ann 


Recording to (025) 


ae, 
oo 


These results as well as the lines defined by the above 
equations are shown; in Fig. 2.2. 

i Das new approach to the solution may yield more meaning- 
ful results where, the indiscriminate use of the least-square- 
error procedure leads to unexpected or undesirable results. 
Consider for example the problem of estimating the parameters 
of the semiconductor diode model [14] from measured electri- 
cal data. If the diode is forward biased the model essen- 


tially reduces to a resistor R, representing the combined 


22 


X2 


—1OX1-+ 10X2 =| lOX! + lOX2 =| 





Ol X | 


-+- LEAST SQUARE ERROR SOLUTION 
@ NORMALIZED LEAST SQUARE ERROR SOLUTION 


FIG. 2.2 EX. 2.| 
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body, lead, and contact resistance of the actual diode, 


in series with an ideal diode given by 


= 


= S| oe s 
Vp =k &n T. =~ k&n (TL) kin (IQ) 





Vp is the voltage across the ideal diode, I the diode 
current, I. the reverse saturation current and k the 
characteristic constant (for constant temperature) of the 


diode. If V represents the voltage across the actual 


diode, the model is described by 
V= IR + k&%n(1I) - k£n (To) - 


An estimate for the constant parameters R, k, and 
~kkn (1) is obtained from £n measurements of voltage and 
current for the forward range of the diode. The measure- 


ments may be written in the form 


A a qT gn (I) ~kk&n (T.) 
Vo | 1 I, aoe 4 | 
- | = ‘ 2 j R 
) e © | ! 
Min hi I. one Z ns | 


The least-square-error solution results in a model which 

is very poor for small currents and excellent for very large 
currents, because the weights attachec to the measurement vectors 
fe I; &n(I.)] may differ by orders of magnitude. Therefore 


the normalized least-square-error solution, which weighs 


24 


each measurement vector equally, is preferred, resulting 
in a model which describes the actual diode adequately for 


the entire forward range. 


B. RECURSIVE RELATIONSHIPS 


1. Development sof the Recurcave Romar oOncmromsan 


Estimation Problem 

In estimation* problems the least-Ssquare-error 
solution is widely used in the form of a recursive relation 
for sequential estimation where in addition to a set of 
solved equations one new equation is considered and its 
data taken into account**. A typical example of such an 
estimation problem is to determine the set of initial 
conditions for a dynamic system from a sequence of discrete 
observations. Consider the system equation given in dis- 


crete form as follows: 


ae Pe (248) 


i 


Zy M yy, (21503) 


where y, and y,_, denote the state vectors at discrete 

times kT and (k-1)T respectively. 96(T) is the known 
transition matrix of the system for the discrete time inter- 
val T. M is the known observer matrix of dimensions 1 x n 


and Za is the scalar cbservation at Fimewei.. It the stare 


* 

The term estimation is used to designate problems in- 
volving the solution of (2.1) where the elements of A are 
known exactly. 


** 
See Ret. “3, page ol. 


wo 


vector at k = 0 is denoted by the vector x the observations 


can be listed in the following form: 


Z M 
Oo Ree coe, 
zal Jae 
‘s — . x (2, 20 
Z. k=~7 


k | Mo 


Estimation of the vector x is equivalent to solving (2.1). 


Thus in general form (2.20) may be written as 
Z, = A,X (2.21) 
kK o 

If A, is of maximum rank, Lee [3] defines (M,%) as an 


k 


- Observable pair and (2.21) has a least-square-error solution 


aGeardingupes (2.Bdu 


5 24 T 
me PA Za (2.22) 


Where Z,. 1s the vector of k observations 


k 


Pel] 
A, the k x n coefficient matrix, A,=[M 1M |...,.0 ~M 


PL the imverse of the matrix AVA, 


= che weast-Ssquare-@¥ror Solution ror x Based Ween 
the k observations 


Now consider the (k+1l)st equation 


_ 
2? = 2 % (2.285 
where Z44 1 18 the new observation or data in the vector Ze 4] 
+ | 
and a" a row of new coefficients. a’ 1S equal to moe for 
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the estimation example above. It is now possible to define 





the matrix Alay and the observation vector 2,14: 
Ay 
Ay 3 ae v 2 = (2.24) 
Se 
The least-square-error solution then takes the form 
eae eee 
: wir. 
where Play is the inverse of the matrix [AL Ana! - This 


result can be written in the following recursive form [3]. 


a ee 
Sees ee T Verne ameee sarees 
eal Pa 
(2296) 
P.aa P 
P >i. sae 
k+l k l+a"P a 


It should be noted that (1+a"P_ a) is a scalar and is 
treated accordingly. The derivation of (2.26) normally 
available in the literature is rather involved and a short 


derivation which has been developed here is presented. 


-—| T : 

me T T T alt T 

k+l” “kia 5 SS 

-| T all 

ee = [I + aa P| PL (2.28a) 


Za 


After premultiplying by P, | 


(2.28a) becomes 


as Al 
Je eee oS Fy 


Combining (2.28b) with (2.28a) yields 


T -| T 
— = 
Pp = Pp Py [I + aa Pid aa P 


Kal k k 


The expression [ItaaB] a may be simplified. 


T =a 
Let eee 1 Beard Pid a 


T T we T 
then € (lave Pa) = [I + aa Py] a(t aie P, a) 


el 
[I+ aa“P,] [a+ aa’Bal 


-j 
[I + aa PL] 


[<I + aa’ Py] 


= a 
Thus }wesee= af(l + a°P, a) and (2.9) reduces to 
pee Pp 
“ _p _ _kS="k 
k+1 ke ae 
ae 
Then x = p az 
= See keto ks Sk+1 
T 
P,_,aaP 
x22 Fe In ; 
= |P, - —— /#/|A. 27 + Z a 
k L+alP _ |) kok k+15| 
Pia 
- T aes T 
= Ee ee T a PLAY Z, + 244 | 
ecee eect 
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and postmultiplying by P 


is 

(a.28b) 

( 29295) 
a 

( 2s '0 ) 

y aap a, 
Poa - 
k— 7 

eiece Pra 


z X, = = aX, The] or 
Ae ees Pa Te P a 
Pa 
= oa a k—- _ Jian 
+1 ~ qe 7 i Ze ee a) 
ec Pia 


(2.30) and (2.31) are the desired equations of the recursive 
merm (2.26) . 
By comparing (2.3) and (2.17) the recursive relationship 


for the normalized least-square-error solution follows 


pra” 
x* = x* + ————— (2* , - a* x* ) 
k+l a ta* Pea k+1 k 
(2.32a) 
Pta*a* Ps 
mic lame, ee 
l+a*'P*a* 
Now because of the normalization 
— 
ee ee 
(2. 325) 
- 
ay @- (aaa) | a 
Substituting (2.32%) into 82. 32a)™yibanis 
P*a 
A a — Ta 
f * = * se oR eR ae = * 
en So Tree oe ee 
a ata Pra 
(232 c) 
Ptaa Ps 
px = ps =) See 
k+1 k alas pes 


From (2.32c) it follows that explicit mormalization of each 


equation is unnecessary. Instead a slight modification of 


a 


(2.26), as given in (2.32), results in the desired algorithm. 
Tt should be. noted that (2.32) is not valid for the meaning- 
less Obsaruat ton with the coefficient vector a = 0, which 
must be excluded from the recursive procedure. For this 


case 


vk = ak 
26 eels 

CHa — an) (2.32d) 
px — px 


k+1 ~ *k 


‘In the following example sequential estimates for a 
vector of constant elements are computed from noise con- 
taminated measurements using both the least-square-error 
and the normalized least-square-error procedure. The 
measurement noise is derived from a noise population with 
zero mean and a distribution of finite extent (i.e. uniform 
distribution), rather than from a normal distribution, in 
order to conform with practical problems where the largest 


possible measurement error is bounded. 
\ 


Pexample 2.2: Estimate the unknown vector x of dimensions 


2 x 1 from the scalar measurements given as 


Zz, = Mx + v (2.33a) 


k k k 


where 


(2.33% 


tI 

= 

TH 
aK 


le 


My is a time varying observation matrix, Zz, the scalar ob- 


k 


servation and VE the measurement noise. At time instant k 
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the k equations, according to (2.33a), may be written in 


matrix form as 


Z, = A,X + VE €2..33c) 
where 
0 
1/4 
a ; : (2.33d) 
k-1, 2 
1 oo 
Zeal Z Z a (2.33e) 
=k 12 k ° 
* =n» my (2.334) 
k 1 2 ee @ k e 


Since x must be estimated from noisy measurements, 2,, it 


is necessary to solve the equation 
Zz, = A, x (2 Saeq) 


Ww 


age SOlUCiON Of (FY O33g) ror the eStifjiave OF x IS Given py 


aly T 

Xx, = PA, 2, (2.33h) 
eC ee (2.334) 

k Keak x 
The estimation error is then defined as 

= _ a 7 = Aly a ar _ 

% =X 7 3 7 Pape = Peel eee 

eet (238i) 
k k—k 


oF 


ay: Opes). 
PeoMmarsmrorea Specific case, where x -/ 9 and the measure- 
ment noise re is a sample from a uniform noise population 
with maximum deviation +0.1 and zero mean are shown in 


Figure 2.3 and Figure 2.4. The sum ¢€ of the absolute 


estimation errors 
e = [xt] + |x? (2.332) 
k k : 


and ca aces thie tLworelements of X, is shown for 


both estimators in Figure 2.5. Note that the estimation 


where xX 


error does not approach zero. 
The experimental result of Example 2.2 may be verified 
by considering a similar estimation problem where the 


sequence of observation matrices iS given as 


a 0 
ili 1 

A, as (2.33m) 
ie? ol 


The matrix P, in (2.26) for this problem takes the form 


ae 


wile 


Lor Keowee 


In the limit as k goes to infinity 


kim P= (2. 32mm) 
x 
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and ok = = v (2 .38I6)) 
ee seem jo. a’ 2 ei (94 
"k | 
Since Vi is a random variable, it follows from (2.330) that 


am 
k7o 


{Prob (x-x) = 0) = 0 


and the estimation error will approach a constant bias 
dependent on the value of the first noise sample. In 
Example 2.2 the result is quite similar and the estima- 
tion error will approach a constant bias with probability 
equal to 1. This bias is dependent mainly upon the first 


few noise samples as shown in the following. 


k 2 
k ele =) 
2 
aja = 
k 2 k 4 
r (1-5) year =) 
2 2 7 
i 4 Ik 2 
es 2 5 (1 
| k 1: k i 
2 2 
and Py, = & = 7 | 
4 3 2 
a P-E E (1-3) 1 
2 : k 2 2 
(2338) 


In the limit as k goes to infinity the elements of PL A@auoe 


go to zero but approach constant values. An approximation 
of ae PL may be obtained by converting the piecewise- 


constant functions in (2.33p) to continuous Functions and 


the summations to integrations. Thus 
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Since 


Also 


and 
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ea 


i IL 
{=[x-2£nx ae 


Ale 


di 
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iz oes on 
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I 
(1-=) + re 


2 
(1-5) “dx] } 





: k+.5 k+.5 2 
i al { {x-4anx -2 +4 - 9 -=[ (x- 22nx - al is 
x ox SIRs E.5 
2 -(1.5 - 4an(1.5) - pe + 45 - — 5 
: 1 32105 
,2im { (k+ 5) - 42n (k+.5) _ (k+.5)° {K* +5) on (Ke 6)- ={ Unlee wa 
ces ‘ ° k k ‘ k 
ane 4 
= 3s 3 eee [2nk] 
= 3.33 
Since 
2 ; 2 
Rim 4 : am 6 
k-7+0 kk PI) es k0 [eet] 
= 4 Sims, L(-p) + = (yey 4 Ly 
- k>© 2 k ." 
= 0 
Thus 
i “a 
Lim ale , 
a (2 .33q) 
and also 
Qim — &im Tr 
ko0 ~R * k+0 Pk 
Il, co 
“ees , : (2-)v, (2s) 
= eee 
- 5 7 9 
~ (3 - (+e, +2 3+ vi ae Ve } 


i 


Eq. (2.33)r 1S a weighted sum which places emphasis on the 
first few noise samples and whose weighting factors approach 
zero as k becomes larger and larger. Thus for large k 
the estimation error depends mainly upon the first terms of 
the summation in (2.33)r and approaches a constant bias. 
As a consequence a Example 2.2 the assertion in Lee 
Rim 


[eef. Bpepage 53] sthat i PL = [0] is contradicted and the 


estimate obtained is not consistent. 


2. (Apelication to Identutication Problems 
The foregoing results may also be used in identi- 
fication* problems. Consider a discrete-time system char- 
acterized by the discrete equation 
x, = AX, tag kote + Faye, thy Uy, ip tho Up_ot- ++ tb UL 
(2232) 
where in general n > m, and Xn and Up represent the 


system response and driving function at time t = (k-n)T. 


It is desired to determine the coefficients a; and Ds 


(where 1=1,2,...n and j=l,2,...m) from sequential measure- 


ments of the output. If the measurements z, ware noisediege 


ki 
then 


after k = ntm measurements the following data bank will 


result if the system starts with zero initial conditions. 


x : mee. ; . 

The expression identification 1s used to designate 
problems involving the solution of (2.1) when some or all 
of the elements in A are random variables. 
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Z4 Xo 0 0 Up 0 0 ay 

a 

Zo 2 xX Xo 0 | Uy) be 0 n 

2 b 

aoe 
aul Ant+m-1 ae eee 2 *m Yn+m-1 a Pa 
(2.35) 

or 
oy 

Zo¢m ~ !%nem } Une! oa (2.36) 


The identification problem is then to solve (2.35) for the 





wc. | 
vector| - 2 An exact solution is obtained only if the 
—m 
‘ | . ‘ . ‘ . 
matrix IX em | Bead 1S nonSingular. Sequential estimation 


is then possible using (2.26) or (2.32) when the number of 


observations K > mtn and the matrix [X | ] has 
— nN | n+m 


+m 


maximum rank equal to mtn. 
In the presence of measurement noise the observations 


Or the output become 


Z = X * V 


kee io ki 


where Vie is the measurement noise. Eq. (2.35) may then 


qe dL 
be approximated by 


a 70 : | sl 
0 i 
Z Z Zz ° a 
if 1 0 | U,. --2 (2372) 
: | 1 
| ; 
24 4 a2 Zon | Dn 
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Or 


a 
alae a a (mgs 7) 


This solution is proposed by R.C.K. Lee [3]. 
An alternate approach is to use an estimate for the 


element x. in (2.35),denoted by Xs in the matrix xX. Thus 


“wn 


a 


x. = [% . == 
X= (Xp ee) XLp tf Vey ce re (2.38) 
Dm 


This approach, as shown in Example 2.3, has a distant advan- 
tage over the solution of (2.37b). This example also 
demonstrates the difference between the least-square-error 
and the normalized least-square-error solutions as given by 


(2.26) and (2.32). 


Example 2.3: Identify the coefficients ay and by in the 


difference equation 


x = a,x, +tb._u (2.39a) 


k+1 nee e k 


from noise corrupted measurements Zea! where 


Zz ak tr) 


ees “Ice (2.395) 


ect I. 


and Va is a sample of the measurement noise with the follow- 


ing statistical properties 


E {veh 0 
it 


; (2.40) 
pM PP ony = fs ..c Sanyo ae 
1 i} 1, J ’ ty) 1 tea 


40 


where E{ } denotes the expected value, and o* is the vari- 
ance of the measurement noise. Specifically consider the 


discrete equation 
(2.41) 


For a given input sequence fu, sai uy F and zero initial 
conditions, the output sequence {Xo 0X1 _. x I is generated 
using (2.41). The output sequence is then corrupted with 
noise, taken from a uniform distribution with maximum 
deviation +0.1 with zero mean to obtain the sequence of 
noisy measurements {z.,z, ... Z,3 which then are processed 
according to Eqs. (2.26) and (2.32). For both approaches, 
one using the Z.'S and the other using the X,'S of Howie .38) 
as elements in the coefficient matrix X, two computations 
are made - one where the input iS a unit step and one where 
the input is a sampled cosine wave of unit amplitude. 
Typical results using the same measurement data for 
both estimators (the least~-square-error and the normalized 
least-square-error) are shown in Fig. 2.6 through 2.17. 
Figs. 2.6 and 2.7 present the estimates for ay and by ror 
the least-square-error and the normalized least-square- 
error solutions for a step input. Fig. 2.8 presents the 
magnitude of the total identification error €, = ee eo 


+ |b, (k) -b for both cases. Figs. 2.9, 2.10, 2.11 present 


1! 
the identifications using the estimated past values for Xp 
as given by (2.336). Figs. 2.12 threugh 2a) orescence 


corresponding data when the input is a cosine function. The 


4l 


results indicate that the identification error is generally 
smaller when the normalized least-square-error method is 
used and that the identification error depends upon 
whether the input function causes a significant system 
response. In the following it will be demonstrated that 
for the step input the Weentet emeteion error approaches a 
very large value, compared with the parameters to be iden- 
tified, independent of the variance of the measurement noise 
while for the cosine input the identification approaches 

a constant bias dependent on the variance of measurement 
noise. Consider the least-square-error solution of (2.37b) 


which takes the form 


mn 
ral 


(2.42) 
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where k > mtn. For Example 2.3 this may be written as 


is k-1 -l ek-1 — 
ay ., Zs 2 uz; 2 Z52544 
. i=0 i=0 i=0 | 
= | (2.43a) 
mr ie 1 k-1 9 k-1 
b Sy Uae , ue te ee. 
ide i=0 al i=0 nn i=0 L pall 
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Or -l 


resi rea ae 

- | il 2 il 1 

ay ee je ES a eee 
0) 0) 1=0 ; 

: (2.43b) 

ed ee veil 

- 1 1 2 1 

By ai ee ee To. Vig 
= =) a) 


In the limit as k goes to infinity the summations in (2.43b), 


provided that u, and x, remain bounded, take the following 


k k 
form 
aE ee Sema a Kot kv1 
we fe ee Xe Oe I ee 
i=o + i=0 i=0 i=0 
k-1l 
= eal peo: z 2 
eee 
i=0 
. k-1 k-1 k-1 
fam I i weZ2Z. = ATO 1 [ac Ug. + 2p 
Ne 00. Mass i 1: k>o k = ie 1. a aL 
i1=0 i1=0 1-0 
Lim Ae ae 6 
ere koe ab 
i=0 
; k-l k-1 k-l k-1 k-1 
im Tie = -& aaa 
NE cae ee pees Eile 
1=0 1=0 i1=0 1=0 0 
a ee 
keo k . Tat 1 
1=0 
k-1l ’ k-1 k-1l 
x an wee aim al 
3 noe = ne (2 ues, + Pe Wee ] 
k70 k fe itl k>0 k ra ee sg 8 i¢+l 
i ae 
ko oe | as 
1=0 


Then for very large k the following relation is obtained 


fee riiedl 2 224035 ) 


ci @ : u-1 . . 1 a 
y x, + ko Talc a. | 2 een 

i ea= (2.44) 
k-1 k-1 5) | 3,| ik-1 | 
2; aes arr; b | 2 eee 
ii=o 2 i Aare | oi eo Title 


The corresponding expression for noiseless observations is 


ao feel / |. 1 fkel | 
yy Xs 2 u;X, | eo x XsXsa4 
i=0 i=0 i | i=0 | 
| = (2 945) 
k-1 k-1 | a k-1 
, X.U d 2 b eae Uae 
eo i=o 44 : 7 1 eo Sie 
Oils 
ik-1 ; k-1 k-1 ~' @a 
i 2 Xm tke s U | a eX: ko 
oe s=0) 1 iE i=g 2 itd | 
: = | + 
k=1 k-1 a k-1 : 
eu. x. ye ee ay | owes | 0 
Pe coo We) foe te 


(2.46) 


The identification error for very large k is then obtained 


by combining (2.44) and (2.46) 
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k-1 k-1 “t 
> x.+ko De ee -~ko a 
sega i a 
1=0 1=0 
k-l k-1 5 
” u; A » u; 0 
| i=o 1=0 





(2.47) 


In the limit as k goes to infinity, Eq. (2.479) is easily 
evaluated by considering the corresponding continuous system 


wit the transter function 


(s) 


* 


it 
=rT (2.48) 


G 
nn 


Integration of (2.48) yields (2.39a) when the forcing func- 
tion u(t) 1S approximated as piecewise constant and the 


sampling time At satisfies the relation 


aie 
eS =a 


or 


}At| | 2n a (2.49) 


1! 


For the step input, when the system is initially at rest 


Wet = ell 
(2.2510°) 


“Ce 1 -e 


Then 
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This demonstrates that identification of system parameters 
from a step input is only feasible as long as the system 
response is not close to the final value, as shown in 


Bag). 2:56 
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Similar considerations yield the estimation error for 


the unit cosine forcing function. Using (2.48) 
UC ae a 
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and 


(2715 49) 


C25) ) 


(22216) 


(27 ) 


(2 35.8) 


The identification error as shown in Eq. (2.58) approaches 
a constant bias which is reasonably small for a large 
Signal to noise ratio. For the specific example (Ex. 2.3) 
this bias as obtained from (2.47) for very large k is 


See 


b sO 06 
for the least-square-error solution and 


= OREO 


b -0054 


for the normalized least-square-error solution. The total 
bias error reduction for the normalized least-square-error 
solution is) approximately 32% compared with the least- 
square-error solution. This agrees with the experimental 


results obtained as shown in Fig. 2.14. 
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ITI. THE BEST APPROXIMATE S@LUTZION 


In estimation theory, quite frequently one has to 
solve a set of inconsistent or insufficient specified 
linear equations. Since in these cases an exact or unique 
solution does not exist, an optimal or best approximate 
solution has to be accepted. Penrose [4,5] has defined 
this best approximate solution as follows. 

Betti tom: Xx, is the best approximate solution of 


the linear equation 
ao G, aan (@.1) 


where X and G are rectangular matrices, if for all X # Xo 


either 

eee ||| Pere aero] (3.2a) 
or ECX)-GI=Hl£0K,)-6|]_ and [XIE IX, | (3.2) 
where ||X|| denotes the norm of X defined as |/x|| = trace xx, 


In the discussion which follows, X is restricted to be 
a vector of dimensions n x 1 with real elements denoted by 


x. Then Eq. (3.1) may be written as 


Ax =b (3a) 


where A is an mM X nN matrix and bois an m xel vectra: 
The best approximate solution for x, according to 


definition 1, is the least-square-error solution if A has 


ae 


maximum rank, and the minimum-norm least-Ssquare-error 
solution if A has rank less than maximum. This solution is 
obtained by using a generalized matrix inverse developed 
by Penrose [4], and denoted in the work which follows as 
the pseudo inverse. The solution to (3.3) is thus written 


as 


= A’b (3.4) 


[> 


where x is the best approximate solution and at the pseudo 
inverter 

In the following the definition and some properties of 
the pseudo inverse, as given by Penrose, are stated. Then 
some alternate expressions for the pseudo inverse are dis- 
cussed. Finally a new recursive formulation for the 


sequential solution of Eq. (3.3) is presented. This form- 


-ulation has the advantage over previously published results 


(Wells [6]) in that the dimensions of the matrices in the 
algorithm remain constant irrespective of the size and rank 
of A. A flow diagram for the computation of the algorithm 
is presented and illustrated with a numerical example. 
Finally, the recursive algorithm is adapted to the problem 
of estimating the states of time-varying, linear systems 


from noise-contaminated measurements. 


A. THE PSEUDO INVERSE 


1. “Betinitronsand Propemeles 


Penrose [4] defines the following 
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Definition 2: Four matrix equations are Gerined 


AYA = (3.5a) 
YAY = Y (3. 5) 
* 
[AY] = AY (3.50) 
* 
[YA] = YA sw5a) 
where * denotes the conjugate transpose. These equations 


have a unique solution for Y. This solution is called the 
pseudo inverse and is denoted - Y=A, 

The essential feature of this definition is that any 
expression for the inverse of matrix A is established as 
the unique pseudo inverse if and only if it satisfies 


Eq. (3.5). AS a consequence of definition 2 the pseudo 


inverse has the following properties 


++ 


A a (3.6a) 

puta = atx (3.6b) 
+ Se . 

A =A if A is nonsingular (3.6c) 
Gas 2) Lae owed) 
(Axa)! = AA* (3.6e) 


a 
A,A*A,A’,A'A have rank equal to the trace of AA 
In addition, for completeness, define [1] 


0 = 0 (3 GeE ) 


2. Alternate Expressions for the Pseudo Inverse 


Tt is desirable to be able to express this inverse 


by a mathematical formula and the following results are 
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essentially available in the literature as discussed by 
Deutsch [1], Koenig [6], et. al. 
aes Ovencdccermned Case, M>nyer—n 
As shown in Chapter I, the solution (3.4) is 
obtained using 


=i 
at Ot Ajeet (329) 


which corresponds to the minimum mean-square-error solution 
Oia (3 cca: 
b. Underdetermined case, m<n, r=m 
The solution (2.4) is obtained using 
+ ie 
A = A [AA’] ( 3086) 
which corresponds to the minimum-norm solution. 

Equation (3.8) satisfies definition 2 and is thus the 
desired pseudo inverse. The fact that the solution is the 
minimum-norm solution can be demonstrated geometrically for 
the three-dimensional case as follows: 


Given two equations in three unknowns 

x 

y\ = (3.9a) 
Z 


Find the minimum-norm solution for the unknown vector 


[x y us 
Eqs. (3.9a) represent two planes 
Te = ate (3.9b) 
a b= Co (3a c} 
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where r is the position vector from the origin to the 


POL Ely zp 


a a vector with components 


and b a vector with components b 


2 ae 


21 b3 


alee 


1° 


Normal vectors towthe planes? are given sy a aneme 


Then any point on the line of intersection of the two planes 


Satisfies (3.9b) and (3.9c) and the desired solution is the 


point on the line of intersection closest to the origin. 


Let the vector from the origin O to this point N be desig- 


mated by ON “see ee ice Oe 


the vectors a and b 


ON = y,;a + Y5b ee) ORs ha 


where Yq and Y, are scalars, which 


a linear combination of 


- Y 

b - (3.9d) 
Z Y9 

Dy 


are determined from the 


condition that ON has to satisfy (3.9b) and (3.9c) as the 


Position vector r. 


Thus 


a 
| 
Q 


(ya i" YoP) 


Lo: 
| 
Q 


(ya + Yb) 


and 


han 
a 
ht | 


| @ 
Ss 
[oO 
[oO 


Gre 


| 
| @ 
| @ 
[oO 
eet 
| 
[— 
") 
Q 
_ 
ee 


(3.9e) 


zi = tL 2 


LINE OF INTERSECTION 





FIG. 3.1 MINIMUM.NORM SOLUTION ON. 


62 


Equation (3.9d) may be written in component form using 


(3496) as 
mal fy oP RMe Pay Pi 
N it 1 oe), x Aaa at 1 
Yui — 189 5 3 3. 5 ae roy 
ZN oo ed on ere 
or 
x a b a a a a b = 
N 1 a 1 Z S It il Cy 
Si aaa a ieee : a 20 
2N go ea | cee 
(3.10a) 
If the matrix A is defined as in (3.9a) 
A = hs - me (3.10b) 
di Zz S 


the result (3.10a) may be written in the form of (3.4) 


' Cc 
+ 
vy |= A : (3.10b) 
tly 
on 
T 


where AT =A ated ae 
This shows that in this case the pseudo inverse in (3.4) 
results in the minimum-norm solution. 
c. Underdetermined case, m>n, r<n or m<n, r<m 
The solution (3.4) is obtained using either 
-] - 


AY = AcN IN AA oN (3.14a) 
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al 
tr T ee - 
Or A =M [MA’AM’] MA ( 3astA5) 
which corresponds to the minimum norm solution of minimum 
Square error. Matrices N and M are defined as factors of 


A [1] 
A = NeM (320) 


where matrix N is of dimension m x r, and matrix M is of 
dimension r : n. The rows of N are chosen such that they 
constitute a set of’base vectors for the column space 
Spanned by A. Matrix M is then the transformation of N to 
A. Its dimensions are necessarily r x n. For example, the 
columns of N might be chosen as all the independent columns 


in A. The pseudo inverse is then given as 


+ 


nS mmm y~2 Per 


[N'N] -N (3s 


because (3.12) satisfies all four equations in definition 1 


as indicated below: 


(1) ata = mM’ (um?)7+t py) ty ym 
= mM’ (vm?) ~1m 
a 
(2) aat = nm mM? ym’) 7tpnty)7*N? 
= n(w'y)4{n" 
[AA*]~ 
(3) aAAtA = nM MoM’ )~* n'y) ~*Ni 
= NM 
=e 
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=i 
in) tN Mm! pym ) 7+ ny] N 


mony tw? 


1 a 


(4) aAtaa* = wm? tym?)7 
m? (um? y~ 


a 


il 


Expression (3.12) which involves two matrix inversions may 
be simplified further to expressions involving only one 
matrix inversion. Since both matrices M and N have rank r, 


(3.11) can be solved for either one 


leat 


M [NoN]7 NA (3. 13a) 


am~ tum!) 7 (28185) 


Z 
Il 


Salestitution of (3.13a) into (3.12) yreles 


A’ = AN (NAD MI NE (3.14a) 


and substitution of (3.13b) into (3.12) results in 


deere 


a z am: ]~*MA (3.14) 


at = wm tmat 


Pia expressions, (3.12), (3.14a), ame (emi are Va lak 
general expressions for the pseudo inverse. However, if 
the matrices involved in actual computation are to be as 
small in dimensions as possible, (3.14a) and (3.14b) should 


be used as follows: 


(1) m<n, r>m 


| -1T 
at = alwiwiaa™n) Nn 
(2) m>n, r<n 
ToT ep 
at = m-tmatam’]~~MA 


B. RECURSIVE ALGORITHM FOR THE SEQUENTIAL LEAST-SQUARE FIT 
In Chapter I a recursive relationship for sequential 


estimation based on the equation 


b = Ax (3.15) 


and its least-sgquare-error solution 


1 


x = [A’A]"“A'b (3.16) 


is given. However this recursive form is only valid if the 


matrix fata] is nonsingular such that its inverse p = [(ataj "2 
exists. 

Consider now the case where no assurance as to the 
existence of the inverse can be given. Using the pseudo 
inverse it is possible to write formally 

k= A’b 
+ + 
= A*[aA’]*b 
(3.17 
= ataltaT, 
= [aj*als 
or x =P Ab 
where P is the pseudo inverse of fata]. The dimensions of 


the matrix P are always n x n independent of mor r. This 
1S Significant because in a sequential procedure both m and 
r increase as more data are incorporated. Therefore, the 
use of an algorithm updating the matrix a as proposed by 
T.N.E. Greyville [22], may not be practical for sequential 


. = 
estimation because the dimension, m, of A grows at each 
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step. Alternate methods [7,8] using an updating procedure 
for A’ until the matrix A has dimension n x n are considered 
below: 
Ch) Dinrect.updaring wor Ae 

At each step, as additional measurements are incorpora- 
ted, the size of A grows one column for each step. C. H. 
Wells [7,8] presents an algorithm for the updating of a 
However his procedure has the disadvantage that when initi- 
ating an estimation problem the rank of A must increase at 
each step until maximum rank is reached. This is not the 
case if the first n equations in (3.15) are dependent. 
(2) Updating aye using (3.8) as long as m<n. 

A recursive algorithm or direct computation based on 
(3.2) is possible only if the square matrix AAs of growing 
dimensions, remains non-singular. Thus the rank of A has 
to increase at each step which may not be true. Further- 
more, a recursive form could be used only as a starting 
procedure up until the matrix aA’ has dimensions n x n. 

Consequently, it is desirable to find a recursive 
formulation similar to (2.26), where all matrices involved 
have constant dimensions regardless of rank or size of A. 
This result is accomplished here with a recursive form for 
the matrix P, where P is the pseudo inverse of Ne 

In order to derive this recursive algorithm, consider 


the set of equations 
z colt: 
dl oa 
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where the subscript k denotes the number of equations. 


Recome Eaemoolutionsoetethe form, (3.17a) 


oe epee + T 
X= [ALAR AL 2, 
(35.19a) 
a i 
= PAL z. 
where P, is the pseudo inverse of ALA, and X the® best 


approximate estimate for x based upon the last k equations. 


Then at the next step (3.18a) takes the form 


—k A, 
— = AL 1X = is x (3.18b) 
k¥1- a 
‘ Le 
where Ze] 1s a new scalar measurement and a is a row 
. vector of coefficients relating the observation Za] t@ xX: 


The solution to (3.18b) is then 


A (3. Ua) 


x = P 7 Z 
—k+1 k+l “k+l =—k+t+l1 


In order to find an alternate expression for (3.19b) let 


“Aw 


x= x +t Ax , (3.20) 


Substitute (3.20) into (3.18b) and premultiply with ay 


tO Obtain 


Z 


T 2x T ~ 
A eoS_ |) = Ae SA ei 4 Ax), 
el |Z rae ene eae 
Or 
T ase TA 
RR Hy 1 ole ee | apy 4s) - 


68 


Bue 


+ 
ae 
Thus 
T + T. ea i 
AL 2, ~ Pix, + (Za ee x.) a = [P.. + aa JAx (3.2la) 


The term Ay Zy = Py, can be shown to equal the null vector 


as follows. According to the defining equations of the 


pseudo inverse, (3.5), 
= [ALA] "Ay 
= AVA, AL 
= Ay LA,Ay 1A, Ay 
= atatT ATA, A, 
= [AA] (AA, 1A, = Bare . (3022) 
Also since PL. - Py and Py = [Pr 
PPL = (PP, ie 
= PYPL n>) 
Ween using (3,22)and (3223)7, 
Az ~ Pky = AyZac 7 PLP AZ 
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a Javea 


= 
= {I - Pi TIAL 2 Zy 
= {I - oe SPUR Ata = ; (3224) 
Then Eq. (3.2la) reduces to 
(east) 2 — (et ang) wee (3. 21b) 


ime solution of (G.21b) Lor Ax is obtained as follows. 


(1) If P, has rank r <n and ear Py yr) a #0 


the solution 


[I-P, plea 
dx'1) = ami (z,,, - ax.) (3.25) 
= a rI- pt la k+1 — ~-k 
kPk 
Satisfies (2.2lb) by inspection. Ax?) is not defined if 


either ‘ is of rank n, which implies that [I- -Py Pd = 0, or 


dt | oe -Py K! a= 0. 


: alg a 
(Qa) If Py has rank r =n, or if [I-P,P Ja = 0 


the solution is given by 


(2) k= 
Ax = ———— (2z - 
= lta™P a k+1 


(3. 269) 


SUpstleution of (3.26) into (3.21b) yremee 


Ta 
(2) _ *k+1” = >& 
a 


ete Pra 


[Py+aa‘] Ax a (a" Pp yo) a- [I- Py Plas (3.273 
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k 
[I-P’P Ja = 0 (3.27) reduces t6°9 (3e7!)). 7 Usingwthe above 


, + 
Since either [I-P,P)] = 0 when P, has rank r =n, or 


two possible solutions for Ax, which according to the 
conditions (1) and (2) are mutually exclusive, recursive 


forms for the solution of (3.18a) are established. 


(1) Recursive relation if PL has rank r =n, or if 
cps = 
ee PLP Ja 0 


Pom (3.20) and (32.26), 


Bia. a aX) (3.28a) 


Combining (3.28a) and (3.19b) determines the updating pro- 


cedure for the matrix Pe? 


P_aa Pia 
A ih k—— Tt k= 
= = P,A, Z - ————- PAZ + Z ——_-_—_ 
==le all k k=k l+alp s kU k=k k+l ltarp a 
k= — k= 
T 
ee ae KS Ay 2 me Seep — 
l+a Poa lta Pa 
— — “k= 
Since 
T 
Pr. aa Py 7 Pra 
ae 2 ae 
steel Pra ee Pra 
. PYaaP, 
x = [P, - ] [A Zoe a] 
—k+1 k l+arp z k k+l 
See 
Also 
x = Pp [Al 2 + 2 a | 
See en eek p= 
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Ehaus:; 


kr 


T 
a ae 


lta’P a 


k 


(32726b) 


The new matrix satisfies the defining equations for the 


pseudo inverse 


3.5) Ens (3.26a) and(3.28b) 


the required recursive form 


(2) Recursive relation if P 


x + k= (z ae ) 
—k l+a’P a k+l — —k 
oF 
> _ -&22 Fk 
k 7 
ee Pra | 


” has wmank r<n and 


constitute 


(382) 


[I-P)P, ] a ¥ 0. This condition excludes the solutlongeig@e2a7, 


which does not 


“Nw 


c+] 


For notational 


244 


Satisty (ami). 


+ 
S a = alee 
—k iP + k+l — —k 
a [I-P, P Ja 


convenience define 


“- 
[I-P, P Ja 
a + 

a [I-P, Py Ja 


and note that S44 has the following properties 


a 
2 Fk+1 


eee 2 


T Z 
ser = 


T 

—— +S — 1 
+ 
ae 
T <p acine 
oo 


Ve 


Then from (3.20) 


and (3.25) 


(3.29a) 


(3230) 


(Sra) ) 


The desired updating procedure for P 


K+ ] 1S found by ee 


bining (3220 )}erond (3425) 


nN nw ees 


Seep ee ee 


T pT oo 
kee ee Se eee 


it 
42, 
> 
N 


ae los + 


[Py Gey @ PL IAL Z, + 247547 


Also, 
x = [P jaz 1 ag P a 
—k+1 ‘ee lb —k k+1l°> k+1—- 
Then Pi4z must satisfy the following conditions 
aaa st po (3. 32a) 
k+1 k+1 : 
at : oe 
Simce ae ee] iS a SYMeieErCe Maeiic. 
ie _ T az 
(b) PL AK = [Py 9.472 PIA, (32525) 


A possible solution satisfying the above conditions is 


© 


i si 
ame gee ch cre 


T 
Rese meee aS re 
(3.29b) 


Assuming symmetry of P Eq. (3295) satisfies “C3. 32a) 


les 
by “inspections Using (3.22) and" (3.31), Bee 295) eae 


be shown to satisfy (3.32b): 


T a T 
Pte = PPR SR 12 PT kt Fe 1k 


Since 


T oT + 7 
Demi > ween A) 
and 
T on 
CeO: 
then 
Tf T T 
Prada, = [I-g,4 a P, JA, 


The last condition (3.32c) is also satisfied by (3.29b) since 


ae TT. 5.7 T T 
Pieiae™ Pp” Syn Pee 7 Paget Chay P) als ania ae 


_ “ aL a a 
eR ore a Pee ie eg caer eT 


Hence 
Pxt12 = Shei 


In order to prove that (3.29b) is indeed the correct 
: eer + 
and unique expression for the pseudo inverse PL tA Aa! 
Pad has to satisfy the defining equations for the pseudo 
inverse, (3.5). Proof that the equations in definition 2 


are satisfied follows: 


Puoor. Using (3.31l)aand (3.23)4 


(1) 
ee Ee me ae ee 
= LPL - yey 2 P,P + Pyas “Hyp (a P_aa 
‘i PRAT aa + (14a"P, 8) Gy 414122 
= PLPY - Sey12 PEPE + ad (3232e) 
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[T=P Pe jec lt=Peren 
P pe = p oe 4 2 See See (3.32b) 
lee ioe T 


Ee 
a [I-P, Py Ja 


+ 
Thus Pe kes] 1s symmetric and 


A) a + T 
eee ¥ PPK! 


+ 


- + Th 
(2) Peg? = FPpa Paar! 


’ + + 
This follows since Pay! Pra? Peed] are symmetric. 


C)eebsing “(moze 


fe za fe + 
Pact eect = Peer Pee Ker! 


aD A 
te eres [I-P, P, Jaa [I-P, Py] 
= [Pitaa ee ee 


k ik aa 


7 
an toe ee ele 


oot. ot T. + a 
= PPP + aeyP,P a 2 oe e ee 


+ + + 
Sia ee aa 


CP elising (332a) 


* s 4 
Pi ict etd ~ Ppa Ppa! Pe 


rc 


+ T r T T 
PLP + 94a ULOP Pl erg? PeeRee Se in 
T T 
Hoe Lay Py ay 49 Sy aa 


+ a T + aT le | T 
PL PLP Fyne Tego Pym~ PyPL Ppa Tey + (La Ppa) ye Ge 41 Fio) 


"5 


ess 1 T T T 
ee yee Py, ~ Pada + (ita Plaid) gi 


+ 
S\omlecaia dent 7 eae 

This concludes the proof and (3.29b) is indeed the desired 

updating procedure for the matrix Pye 

To complete the recursive algorithm for the solution of 


(3.18) a recursive form for the matrix R 


- 
1 om [I-P,P,] has to 


be established. From (3.32a) 


a + T 2+ 
Petty = PKPR + Se4e72 [1-P, Py] 


then 
i Te, See 
CaP Papa) = EEoPLPL I ~ Seyya Po PL Py] 

and 


aR (3.328 


- > Fes, 2 Re 


Kau 


Note that the matrix R,. remains unchanged if the recursive 


form (3.28) is applicable because, in this case, 


- 
P,aaP 
+ ” i + T 
oleae 1) ae etl T Kee 
dihia Pra 
Jk dl 
P,aa P a Pa 
= P,P) + P,aa’ - ——* p? - —_*_ paa? 
lta Pia lta Pia 
- 
P,aa 
+ 3 + — _ + 
Pehl kt1e wok io SEE e 
l+a Pia 
— k= 
Since either 
Saas lr AT 
[I-P, P,] = Q OF ata PLP] = 0, 


"6 


then 
+ i 
Ee era ea 


and 


A ee 


Then the complete recursive algorithm, if Py has rank<n 


and [I-P, Pola #0, is given by (3.29a), (3.30), (3.29b), 


and (3.33), which are summarized as 





R,@ 
oye = (a) 
a Ra 
Sele 
x = xX + (z = ae ) (b) 
+] SRS TRS eS 
(3.34) 
= _ ae a eal att 
Pict) = PD 12 Pym Pp ageg 7 te PLT ee §C) 


q d T 
a ere ye (d) 


where R, = [I-P 


which is equal to n-r, where r is the rank of A, and n the 


mae is an idempotent matrix, the trace of 


maximum possible rank of a computation flow diagram for 
the recursive algorithm (3.28) and (3.34) is given in 
Fig. 3.2. Note that if the normalized least-square-error 


solution is desired, only Eq. (3.24c) has to be changed to 
Ee a fh . if? T | | 
Pear = Pm Tea Pym Ppa Tey t (a ata Ppadg G4, (3+ 34e) 


The complete algorithm, Eqs. (3.28) and (3.34), ws) 2ieac- 


trated in the following simple example. 


va 


LEAST SQUARES FIT RECURSIVE ALGORITHM 
COMPUTATION DIAGRAM 


aki 


YES NO 


A 
Xn2(3, faa )a 
NO n 3, /a9 y ; 
trace Ry) >.5 R= a0" /(ola) 










R,= I-ga"/g"g 


RETURN 


CRIT = a"R, a/a'a 


"CRIT 105 


NO 





USE USE 
EQUATIONS EQUATIONS 
3.34 3.28 
RETURN RETURN 

FilGeorz 
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Example 3.1 Given the four equations below, calculate 


the best approximate estimate for x sequentially. 


1 = {1 O]x 
2= (3 O]x 
3 = {1 EX 


4=. [2 Six 


Following the computation diagram in Fig. 3.2 the results 


below are obtained 


a) oa" oll Zz, = ib Ee = (1 OQ] 
- 1 0) Be) 
See : Decne : Ree 
= 0 7 0 0 1 1 
trace Ry = l 
c 
b) k = 2 Zo = Z a = [3 0} 
trace Ry = 1, CRIT = 0, then vusem(3a28) 
Pia 8 
xv. = & pei tee 
=é —1 lta'Pa 0 
Pp aa I 8) 
1-- 1 
a ss 
eal Poa 0 O 
;o 0 
RS =—5 ae 
2 | 0 1 
Erace R, =e 


ae 


Cc) k = 3 Z. = 3 a 
trace Ry = 1, CRIT = .5, then use 
— Roa D 0 
23 aT _ 
a R5a5 1 
X3 = Xo + J, (Z37a X>) = 
P. = PA - a P eet 
3 DS 23 
0 O 
R, = 35 wg a R, = 
3 2 3 2 0 0 
Eeace Of R = 0, thus all following 
aeeerding to EG. (3.28). 
d) k= 4 ee ao 
Pia 
an x k= T 
Se = ee (z,-a xX.) 
=4 —3 if. Pee 4 ae —3 
So noe 
Pjaa'P 0952 
_ —— 3 
Pe me = 
eee Pia -.104 
The vector x, is then the 


given set of equations. 


ee 


(34.34) 


3 
il 
a 
T TE 
(l+a"P5a) 9393 = 
fa ae | 5 ie 


equations are processed 


1] 


eee 


6.67 


-.104 


~/14 


best approximate solution to the 


The recursive algorithm presented here is more general than 


the one used in Chapter II, because it includes a starting 


procedure. 
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Regardless of the rank of A the estimate 


obtained is the best approximate solution according to 


aqefinitwon iL. 


C. ESTIMATING THE STATES OF A LINEAR DYNAMIC SYSTEM 

An alternate interpretation for the above recursive 
algorithm is that of determining the constant state vectem 
x for the following system from a series of noise-contam- 


moaatead scalar measurements: 


sys tem: 


I 


X41 ae 


Measurement: a] Met eke] 4 Yk+1 


where Mea] is the time-varying observation matrix 


and Vea] is the measurement noise. 


In order to be able to estimate the state vector for 
a dynamic system, where the transition matrix oT ee is 
in general time-varying and not equal to the identity 
matrix, it is desirable to develop an algorithm similar to 


(3.28) and (3.34) for the following systems and the scalar 


measurements 2Z 


System: d 


(3.35a) 


Xt] ~ *k+1,k%k 


Measurement: Zig M4 Xk41 + Vea] (3.44355) 


For the case when the system of equations for the estimation 
of x, is determined or overdetermined (P, has rank n) Eqs. 
(3.28) are easily adapted to include the transition matrix 


[3]. Let the equation 
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Z 


xr ey (3.36a) 


Kg kk k 
be valid at time instant k, then the solution is 
ae oc -1.T 
X rani [A Ay ] Ai. 2, (3.36b) 
and 
ae ph = 
Py a [A A, ] (3.366) 
Using the property of the transition matrix that 
[4 ao (3.37a) 
k,k-1 k-1,k ' ; 


Eqs. 


O41 = 


oe ake 


= O41 kek 


= 9 


(3.36) at time instant k+l take the following form 


(3 Bb) 


T 


T 5 oT 
k kt kk? k .k+1! 


le eee 


Il 
A 
k*k 


Nn 


(3% 37e) 


oe al 


= 
K kt kak ke  k+1! 


T 
k+1 kk k4+1,k (3. 37d) 


Thus whenever the matrix PL has rank n, the valid recursive 


algorithm for the dynamic case, including the new observa- 


tion Zea 1s 
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il 


Cee = el eee 
0, .M-..M ..O 
' ct i ke 
P = 0 ee (ise 6 ) 
a a oa, eee 
+1 2k es 
‘ib 
; = an Oe ed 
al k+1,k2k 7 7 ea 
Mt eke eked 


Consider now the case when [Ay Ad] is singular so that the 


pseudo inverse 


Eee 
Py = [A.A] ered ) 


shemla be used. At time instant #«+1 


T + + 


Ser = Ek Kt 1Pk OK kt! aa 


A comparison of (3.39) and (3.40) can easily be made using 


(3.14a). Let 


then 


a i a Eo ellen 
A M [MA AM | MA. 


I 


Also define 


B, = A, ® 


k Kee L 


then 


val 


T 
J oM 9% yaa AK 


ar a: aE Ee 


- T T 
aie Oe eee, oe 


Thus (3.39) and (3.40) may be written as 
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aa 
oT T aap T 
ea > eee meee ee OM Oe 


This reveals that there is no simple relationship between 


T T + 
eae ie £41) oe 


Thus (3.34) cannot easily be adapted to yield the best 


the matrices [AAJ and [94 


approximate estimate for the state vector at k+l. However 
an acceptable alternative for a starting procedure is to 
estimate xX) sequentially using (3.28) and (3.34) until PL 


reaches maximum rank; then (3.38) may be used. The inter- 


mediate estimates when Py Nas ranks rsieare given te, 


N w~ 


Se Spee oc 
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IV. FINITE TTERATILONSMETHODS 


In previous chapters methods for the solution of Ax=b 
and for the pseudo inversion in recursive form have been 
presented for the solution of the sequential-estimation 
problem. In this chapter, finite iteration methods for the 
solution of a set of linear equations and for matrix pseudo 
inversion are presented. These methods are based upon an 
infinite sequential error-correcting scheme, proposed by 
J. Nagumo and A. Noda [10], combined with the Gram - Schmidt 
process [9]. The derived methods require only a finite 
number (equal to the rank of A) of iterations. This 
approach has also been considered by L. D. Pyle [23] and 
some of the results presented in this chapter are similar 
to his. For the proper use of Pyle's algorithm it is 


necessary to rearrange the given set of equations whenever 


the constant by in the first equation, ax = bis is equal 
to zero. Since this may not always be convenient in practice, 


an alternate algorithm is presented in which the computation 
sierts unconditionally. Section Awpresents the baste 
iteration procedure with geometric interpretation. In 
Section B(1l) this method is combined with the Gram - Schmidt 
process resulting in a procedure for the solution of a 
consistent set of equations. These results are extended 

in Section B(2) to solve a set of inconsistent equations 

and the solution is shown to be identical to the best 


approximate solution according to Penrose. An alternate 
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method accomplishing the same result is then derived. [In 
Section C, the foregoing methods are extended to solve for 
the matrix inverse, when it exists, and for the pseudo 
inverse. In Appendix A these results are applied to the 


iterative solution of a set of non-linear equations. 


A. INFINITE ITERATION PROCEDURE 

Consider the problem of solving the following consistent 
set of equations. The term consistent iS used to denote 
the fact that the system of equations is assumed to have 
either an exact solution or a unique locus for the solution. 
Alternatively b 1s contained within the vector space 


spanned by the column vectors of A. 


Ax =) An 


Let ae represent the i'th row of A and bs the i'th element 


in b. Then (3.1) may be written as 


Pa by 

i 

a5x = bo (4.2) 
oe = b 

m— m 


The iteration scheme proposed by J. Nagumo and A. Noda [10] 
solves each equation in (4.2) Sueeeeativeally for x by adding 
towedeh™apprettimation “for xa Corpree@tion ofVappreprmate 

Size in the direction normal to the hyper-plane in x-space 


represented by aliens = b.. After solvingtthe m'th equation 
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the process starts over again with the first equation. het 
the i'th approximation for x be denoted by x. and the 
Initial CSti mate whereas, Kee QO. Then the method may be 


presented as 


T i 
145m ~ Xo+4m 7 1°21 %044m) a 
STi 
a 
7 tT 2) 
o+jm ~ 214jm * 2722%1 45m) TT (4.3) 
aoe) 
a 
= ase a 
=n+jm a Xm-1+5m? Pm SnAm-1+jm) 7 
=i 


Eq. (4.3) may be derived as follows. Consider the i'th 
equation in (4.2) and assume that the solution has the 


form 


x5 Xt Ae (4.4) 


where x, _j is the solution for x obtained from solving the 
(i-l)st equation. Combining the i'th equation of (4.2) 


with (4.4) yields 
T ee 


Ble —l1 Gal 


which may be solved for AX, using the best approximate 


solution according to Penrose, namely 


(4.6) 


Then Ax, is the manimum norm solution of (4.5), ane 


8 7/ 





i eT es ae (4.7) 


This describes the equations in (4.3) before the iteration 
process starts over again with the first equation in (4.2). 
The convergence rate of this iteration scheme, although 
quite rapid at first, decreases asymptotically towards zero 
as the solution is approached. The limit is the exact 
solution if the system is determined, that is, if the rank 
of A is equal to the number of unknown elements in x. How- 
ever, if the SyStemjis undetermined, @che minimumenexm 
solution, as discussed in Chapter III, is approached since 
Gach Correction Ax. is in the direction normal to the plane 


described by the i'th equation in (4.2) 


Be BINITE £TE RATION” PROCEDURE 
1. Sets of Consistent Equations 
The foregoing iteration scheme requires an infinite 
number of steps to converge to the solution. If the process 
is truncated, only an approximation is obtained. As will be 
demonstrated this difficulty may be remedied by constraining 
the corrections to be orthogonal to each other. 

Again consider the set of consistent equations (4.2) 
where each vector a; is normal to the hyper-plane described 
by the i'th equation in (4.2). These normal vectors are 
generally not orthogonal to each other. However, using the 


Gram-Schmidt process [9], an orthogonal basis for the vector 
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Space spanned by the vectors a, @ = l1,...,m) may be 
obtained. According to this procedure a sequence of vectors 
Os is constructed from the set of vectors a. in the follow- 


ing manner: 


—l —l ail 
—2 —2 eo —l 
a 
as = 
—k —k , at — 
i=l oe 
a1 


then the set of vectors Os cons sts», of Mutually @eukhegona 1 


Cia 
oO. 1s the component of a 
alg. 4 —k 


— J eae 


imeethe dimections of Qe which is subtracted from ay leaving 





vectors only. Note that 


the normal component to hss In recursive formulation this 
orthogonalization process may be presented as follows. 


Let {CC »2 Cy} be a sequence of n x n matrices with 


io 


eo = I, then the set of mutually orthogonal base vectors On, 


eee oObtainea from 





Oy, C48: ite = @, teaealculace om using a.,4 
04, 4, 1 = Moo eam 

> k—lehgere 1 es sea 
—K=k 


If the process yields a zero base vector, the corresponding 
vector a. is a linear combination of the previously defined 
base vectors, and the i'th equation is a linear combination 


of the previous equations. Therefore the i'th equation and 
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the corresponding zero base vector may be disregarded, so 
that, finally, there are only r independent equations and r 
orthogonal base vectors, where r is the rank of A in (4.1). 
The exacgitesoduitryon TLomeaqjonsthe minimum=norm solution fer 
x, 1£ the system of equatiiens is undéemdetermaneds is 
obtained as a linear combination of these r base vectors. 
Using the form of (4.7) a correction Ax, 1S made success- 
ively for each of the r mutually orthogonal base vectors 


T “k 
He ~ ee FP ” Seek ae oe 
Kk 


It should be noted that the process (4.10) terminates after 
the r corrections are made. Thus the infinite iteration 


process of (4.3) essentially reduces to an r-step process. 


These results are summarized in (4.11) 








= Cay dale a = jae Caleimlate O using a,. 

O at 
» sets a 
is eee Laon 
—k—k 

On T el ae 

i Se ee SP ee ile eal 
a, O 
—k—k 

(4.455; 


Note however that the first equation which starts the pro- 
cess has to have a nonzero element, Die in thes yec tom be 


because if by = 0, xX} 


ie by = 0 the process may be started by either choosing 


another suitable equation of (4.2) as the first equation 


= 0 which #s Snot "coOnrect=gencraliy. 


30) 


OF Dy winftia laine X 


a. 


Seno Para lle ly 6O aa 


a compact form as 


J 


where P 
O 
ih eae) LOT 
compare the f 
least-square- 
illustrates t 
Example 4.1: 


equations for 


il 
a 
0 


x 
=O 


0 


i 


otep Il; 


=a., where a. x a. # 0. That is, 
=a) a —1 
i: Eqs. (4.11) may be rewritten in 
P a 
k-1~) 7 
x a co (b> — a.X,_)) 
—j} k-1-3 
T (4,12) 
P a.a.P) 1 
aed J) - 
Prod ae : k =p a 
oa) 
and the index j denotes the succeeding equation 
which a, Pya, #0. It is interesting to 


orm (4.12) with the form of the recursive 


error solution (2.26). The following example 


he use some 4.21) 


Use Eqs. (4.11) to solve the following set of 


A. 








Z x4 1 
0 X5 | = 0 
1 on 0 
0 C =] 
O 
IL 
i Oe) ap 
. On lo | / 1 2 
ae I (b ae ) <_ sce | O 2 i 0 = | 0 
=1 ~ =0 12150’ “F-* | 5 i, a ve 
Oss | 0 1 2 | [24 | 
Os Os 8 0 -.4 
C.. =c = Ss Oral: 0 
: Z Or OF -.4 0 oe 


ou 


ao 
Step 2: &, se ae. a" 
4 








2 i-7 ae 
a, 2 1 8 ile 
Spee ee eS) rep cl ale™ Son lcadeelitimie 
ana 4 -.4 4 
—2—2 
Oty Oy , | 4 -4 -2 
C. ae Ch a T = 9 -4 4 2 
OA On -2 2 1 
-10/9 
SEep 3: a, = Coa = mau 
a 1 -.6 z 
= _al —3 Lyj) 24 =e 
X, = X5, + (b3,-a3X5) a = 5/71 5 eee 
Cr 4 4 ~ 4 
—3=3 
O40, 0 tr 0 
Ca = Cae = |0 0 0 
eee 0 0 0 
—3-—3 


Xx, is the unique solution to™the given set®ot  equaupore: 


5 
The iteration sequence as well as the planes described by 
the set of equations are shown in Fig. 4.1. 
2. sets of Inconsistent Haquations 
If the set of equations (4.1) is dine onals Caee as 
is usually the case in estimation problems, "themvector b 
is not completely contained in the vector space Seana by 


the column vectors of A. A solution may be obtained by 


solving the equation 


Asx — 20 (4.13) 


Or 


ILLUSTRATION FOR EXAMPLE 4.1 


\" 


\ 





FIG. 4.1 


go 


Se) 


where by is the part of b contained in the column space of 
A, or the Projection Of b -anjeo the column space of A. The 
remainder by =b - ba GCepteaemeas that portion of b which 

is ignored in the solution, and is orthogonal to the space 
of A. This represents an optimal choice for ba which may 


be shown as follows: 


Let the vector b be decomposed into two components 


b = b, + bp (4.14) 


where ba 1s contained in the space of A and by, is the part 


of b whieh is ighoréd an the Solutvonepmeecoc. If x is 
the solution of the consistent set of equations 
Db, aa 

then || b-Ax, | a |b, - AX, + be || > | bs. || 

m T 2 
where |[b, || = 2% (by - aj xy) 

i=l 
The minimum of |b,|| is obtained when b, = b Then the 


by 
solution for x of the inconsistent set (4.13) is, by 


definition, the least-square-error solution. The standard 


way of achieving this projection of b is to premultiply 


(4.2) by al. 


at ax = Ze 


ll 
ao 
oO 
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but since 


Dag 


AYAx = Ab, (4.15) 


Since (4.15) is a set of consistent equations, it may be 
solved using (4.11). 

Another finite-step process for the solution of (4.1) 
which starts unconditionally but involves a little more 
computation may be derived following arguments similar to 
those which led to (4.11). Again consider the equations in 
(4.2). In order to obtain the minimum-norm solution (if 
the set is underdetermined) the desired solution may be 
represented, as in the previous section, as a linear com- 
bination of the row vectors of matrix A, or, equivalently, 
as a weighted sum of different base vectors representing the 
Same space. The set of Eqs. (4.1), if it is assumed to be 


inconsistent, may be written 


b = Ax + ba (4.16) 
where again by 1s orthogonal to the space of A. 
Now let 

DS \ Aa, + hj Ad. + os Se h Aa, thy (4.17) 


where the Ady 


pendicular and A 


vectors are constructed to be mutually per- 
pA ig the paxzse Of b pamallel tomthiesvcc-or 
Ad, . Thus the coefficients Ay are determined from the 


condition 


a2 


IS je 0a = Ge) ae (4.18) 


K 
as 
OAD 
- (4.19) 
k ieee 
OA Ady, 
Substituting (4.19) into (4.17) yields 
de 
aj Ab aA b 
> aT, 1 ee Me 
a, A Ao a A Aa 
—l —L —r i 
4 oA *b 
= A y Co) + b (4ae203 
k=] oy A Ady, . a 


Comparison with (4.16) yields the desired solution, namely, 


ae. Tr 
x= 5 — — a . (4.21) 
k=1 Oy A Ady 


The mutually perpendicular vectors Ad, are constructed using 
the Gram-Schmidt process. Using (4.9) with On. replaced by 


Aa, and a, replaced by Aa,, yields 


Aq, = C1, Aa fe Aa, =0, recalceutere Ao, using Aa, > 
eet 
Cc = C i ae 1 = 1,2,...,m 
k  ~k-1 rT _ 
OA Ady, k = lo 2aeeeeo 
(4.22) 


Another equally acceptable set of mutually orthogonal 


base vectors for the column space of A may be obtained from 
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the vectors Ad, , where the d.'s are the rows of the matrix 
Nae Since the column space of A may be expressed using the 
set of -ailigene veccors Ad., as well as the set of all 
vectors Aa.. Using Ad. instead of Aa. in (4.22) requires 


less computation if the dimensions of A are such that 


m>>n. Thus it 1s possible to write, starting with Com Ow 
; AB, = C,_ Ad; oles AB, =0, 1 Sf@yel Lee anes AB, using Ad.) 
ee 
cee. 1 TeKEKA s,s mn 
k aw Tae de 
BLA AB, kee ; oe 


(4.23) 
where 8, replaces o, in (4.17) and (4.21). 
In order to obtain the solution (4.21), Eq. (4.23) is modi- 
fied to yield an explicit form for the calculation of the 
B'S, which when multiplied by A yield the orthogonal base 


vectors for the column space of A. 


Ale ic 
eee oy ee 


TT 5 a 
ByA AB) alia 2a 


Q 
Il 
LH 
| 


then 


AB 


and 


Compinang (42722) ~=wratten in recursive form, with (4.21) 
yields the desired finite step process for the solution of 


(4.1). Let Co = I then 


B. = C,_,d, if 8 = 0, recalculate 8, using ds, 


1 
g pata 
G. =ae - Bes ee 
eke TT 
BA AB, 
i 
TT 
ag aa Ey® i = 12 
ao Fee, a eee 
BLA AB, k = 1,2, Le 


(4.25) 


The process is completed when k = r. oe is the desired 


Semicon. 


C. MATRIX PSEUDO INVERSION 

The foregoing computation methods may be extended to 
yield matrix inversion or pseudo inversion. However no com- 
putation time comparison with already existing methods has 


been made. Consider the solution of the matrix equation 
AX = I (4.26) 


where the matrix A is square of dimensions n x n and non- 
Singular. The inverse of A may be obtained as the solution 


Of (4.26m «6 ~cerrvom€ «6(4.612) Se Lew Po =] and’ x* = ae then 


mG 


eee oe (1) 7 a, Xy_y) 
k° k-l—k 
(Ap e2.) 
T 
P a,a,P 
_ Pome rot _ 
= Pind + k ca LZ au 


c 
2 ee 


where ue 1s the i'th row of the identity matrix and the 

Hh lil “approxi Matron fonexX, x is the desired inverse. The 
tabeLal goOndl tiem Xy = Ne is chosen to ensure that the 
starting conditions for (4.12) are satisfied, since 

ans Azr «++, A, are, by virtue of the nonsingularity of A, 
not parallel to al: 
The pseudo inverse may be obtained as follows. 


Let C_ = 1 and X, = 0O, then 
O 0 


Oy = Cy, 49; PE 6,.=0, FreCcalelma lel oe USing qs. 


—1 k k ik 
T 
Onno 

Coe ec _ kk Ata 
k k-1 nn 

eects 

T 

6, 6 = 
X, re es a a al L = ue ar me 
fe S.A Ady = 1,2,...,k (4,28) 


where a; denotes the i'th row of Ata, and is the solu= 
tion for the pseudo inverse of A. Proof of (4.28) follows. 


From (4.28) it is evident that 


S,A°AS, = (0 Ody sas i (4.29) 


oe, 


Since the vectors AG. aly 2 eee) are orthogonal. Ahso 
ala May be expressed as a matrix whose rows are linear com- 


binations of the set of vectors O5- Thus 


if 
PA | - 





6] M (4.30) 


The expression for C.. from (4.28) may be rewritten as 


xe 6,6;A°A 
C =iIiI- 2 
: i=) Sones 
—L — 
(4481) 
= At: 
=I-| fr a ron 
1— leon Noe 
1 1 
where the summation is identified as Y 
F848; 
Y= - ae (4.323 
i=l §6.A Ad. 
—i —1 
so that 
al 
ee oa (4.333 
Using (4.29) it follows that 
C 6. = 0 (Ag a4y 
r—1 — 
and thus 
C_A’A =a (4.35) 
and CY = (0 (4.36a) 
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or Y = YA‘ay. (4. 36b) 


Using (4.36b) to expand the product atay yields 


Atay = a-aya‘ay 


oc 


or ALA — aA Gay =O. (An 37a) 


Since Y # 0 this yields 
ata = at‘ay ata (4.37b) 
From (4.35) 


ata = vA‘a ata = 0 


and then also [ata = vata ataly sa (ies 
This is combined with (4.37a) to yield 

[A "AY - YA‘A] A‘AY = 0. (4. 38a) 
Since Atay * 0, then 


Atay = yA‘a. ( 4b } 


Eqs. (4.36b), (4.37b) and (4.38b) by definition establish Y 


as the pseudo inverse of ata: 


vera A (4.39) 


Comparing (4.32) with the last equation in (4.28) when 


k = r yields 


ee SAA A ae (4.40) 


Penrose [5] has also suggested a recursive method for 


computing the pseudo-inverse. Let 

Ci = I 
then 

¢ iL Tt 7 a 

Chay = I- = trace (CLA A) CLA A (4.41) 
The product C e+] Aen = 0, where r is the rank of A Ae 
then the pseudo inverse of ata is 

ee ne (4.42) 


trace CY Ata 7 


The proof is given in Ref. 5. It should be noted that the 
Penrose method involves at least one matrix multiplication 
for each step, or approximately rn> operations, where n> 
operations are required to perform the multiplication of 
two n x n matrices, whereas the method of (4.28) requires 
approximately 5rn* Operations to obtain the same result, 


namely AA)’ . 
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V. RECURSIVE ALGORITHM FOR THE 


SLIDING-WINDOW OBSERVER 


Since the work of Luenberger [11], deterministic linear 
observation systems, called observers, have been recognized 
as practical alternatives to statistical optimum linear 
filters when efficient and fast real-time estimation of 
the system states is desired. Avoiding problems associated 
with the estimation of a priori statistics, the observer 
simply solves the estimation problem as a deterministic one 
and disregards statistical quantities. The simplest formu- 
lation for the observer is the sampled-data type which 
accepts the measurements or observations only at discrete 
points in time. 


Consider the sampled-data system 


lad 


CS) 


2 eel 


where xy is the system state vector at time instant k, 
rk] is the general time-varying transition matrix from 
meme (k-l)T to kT. M, is the time-varying observation 
matrix of dimension 1 xn, and Zy is the scalar observation. 
As in other chapters, only the case of scalar observations 
is considered here in order to obtain results without time 


consuming matrix inversions. The observer for the system 


(5.1) is given as 


ie 


ee ede a 2 1! Coley 


where x. is the estimate at time kT. 


k 


This observer equation may be rewritten as 


aA wn 
X, = FL xX 


Ak k-k-1 te i 2 (5.3a) 


where 
Fy = [I - Pie 1 (5 sab) 


The matrix FL is called the observer transition matrix. 
Bona [12] has shown that the eigenvalues of Fue which are 
dependent on the choice of Sue determine the performance 
of the observer in processing noi eeeeeeerinetad observa- 
tions. The time-varying gain, Dye for the optimal filter 
is determined such that the trace of the error covariance 
Matrix iS minimized. For a specific time-invariant 
observable system [3], Bona [12] has demonstrated that 
constant gain observer with eigenvalues of approximately 
0.5 approach the performance of the Kalman filter. As an 
observer design rule for time-invariant systems, he suggests 
the choice of the largest eigenvalue ht of iF isemehat 


(A re is approximately zero, with the result that (Fr) * is 


Ik 
approximately zero thereby limiting the memory of the 
observer to approximately the last & observations. Because 


of the size limitations of the memory, this type of observer 


is also referred to as a sliding-window observer. 
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In the following work, stimulated by a list of un- 
solved observer problems [13], a recursive relationship 
for the sliding-window of minimal length for the general 
system of Eq. (5.1) is developed in Section A. In Section 
B these results are extended for the time-invariant system 
to yield the recursive form for a sliding-window of speci- 
fied length. The approach to the solution is quite 
different from the one discussed by Bona, resulting in a 
new design rule for sliding-window observer of exact 
specified memory length. Finally, the results obtained are 


illustrated by an example. 


A. THE MINIMUM-WINDOW OBSERVER 

The minimum-window observer is the fastest linear 
observer possible in that it determines the states of a 
system from the necessary minimum number of observations. 
The eigenvalues of the observer transition matrix are all 
zero [12]. The desired recursive form is derived as follows. 
Consider system (5.1) and its observer equation (5.3). 
Suppose the system is of order n. At each instant of time 


im the State vector, X is determined from the last n 


aa 
Observations. Thus 
eee eT ere 
eee Mente? "k-nb oe 
: = . = (5. 4a) 
ae M10 ®k-1,k 
an i 


ae — 


LOS 


or 


Z = R(k) 


—k (Sep) 


a 


where the matrix R(k) relates the n last observations Zz, to 


k 
the state vector x,. The observability condition for this 
system is#then that Ri{k) is nonsingular for all k. Because 
of this it is not necessary that each of the rows of R(k) 


be the product of an observable pair. The solution to 


(5.4b) is then trivial 


a = IL 
x, = R(k) Z, (Senos) 
where X, denotes the output of the observer. Let the 
matrix Rt (k) = C(k) be partitioned into column vectors 
ek)? = Joy tm, —" —_ gn) : (5.6) 


Then (5.5) may be written as 


wn 


x 


Se c, (k)z 


kent] + So (K) Zp ingots es FE, (K) 2, ty, (K) 2, (5.7) 


Expanding Eq. (5.3) yields 


+ 


x eee) eee 


k 


ates Bey 


a Teese ees oa 


= PP ea Peng 1 2k-nt 7 PkPR-1* Pk-nt+29k-nt17k-nt1 


eel ken een Coren 


So ere 


+, Zy, (Seo) 
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Comparing (5.7) and (5.8) on a term by term basis leads to 


the following conclusions 


(1) PLP oye? Fy_ny7 = 0 (S98) 
(2) g. = Ck) (Se) 
(3) G, (k) = Fy Saag 52) Dim lee rastaeeey Mieel: (S26) 
sea that 
C ik) = Jeep) J++ fFyey Ok) 1c, 00 | : (5.9d) 
The recursive relation for the gain vector g, = ¢ (k) may 


be determined from (5.9a) using (5.9c) repeatedly. Thus 


Be LE 


‘ F, ¢,(k-1) = 0 (5.10) 


Red ent en nr 


Salier= 


Py ae. gM, | ee 
(5.10) may be rewritten as 


(k-1) = g,M, 6 c, (k-1) (Srlas) 


eee J ke 


The desired result is then obtained directly from (5.11) 


al: 
ail c., (x) = eee c, (k-1) [M, ae C, (k-1) j (5.12a) 


Note wthat for scalar observation Eq. (5.12) may be also 


written as 


Oo . c, (k-1) 
ee (5°12) 


Me 1k 21 KOT) 


tk 


Oe 


For scalar observation the desired algorithm is therefore 
iments mteelein) (5.95), (5.9c) and (5.3). This is 


summarized in (5.13): 


ee ee (k-1) 
aie aa 

ee key 
FP, = (1 - GM! Oak 


(Sais 
c (k) = Py Sa4y (ko), I) Seb seas lL; c,, (kK) =g, 


Xy, kote Ste 2 


This completes the derivation of the recursive algorithm for 
the minimum-window observer. The computation time required 
is almost equal to the computation time required for the 
corresponding optimal filter (3.38). For linear time- 
invariant systems with a time-invariant observation matrix, 
the observer gain and transition matrix are constant. 


Thus (5.13) reduces to the observer equation 


(5.14) 


and the required computation time is reduced substantially. 
Although the minimum-window observer is highly sensitive 

to measurement noise, practically excluding the direct use 
of (5.14) for non-deterministic problems, it is nevertheless 
possible to construct acceptable filter schemes from (5.13) 


or (5.14) for special applications. Consider, for example, 
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an oscillatory time-invariant system given by Eq. (5 30) ach 
zero damping. Then a simple averager following the minimum- 


window observer is characterized by 


Xx, PX, _1 ty G2) 
$-1: ig 21 ae (5.15) 
ee a =i “k “*k-1* k =k 


where Xs the output of the combined filter, results ina 
filter performance almost indistinguishable from the optimum 
Filter (3.38). This is demonstrated in Example 5.1. Note 
that the computational requirements are only a fraction of 
the computational requirements for the optimal filter. It 
is the author's opinion that this result merits further in- 
vestigation in order to find some design rules for simple 
and fast observers with almost optimal performance similar 


wer thnewene given in Eq. (5.15). 


Example 5.1 


Let the system equations (5.1) be given as 


(S72G)) 


Zz. = Mx iar O)x, + Vv, 


where the observations are now noise-contaminated with the 
measurement noise Ver a noise sample drawn from a uniform 
distributed noise population of zero mean with maximum 


deviation of +0.1. Estimate the system state vector xy 
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using (5.15) when the system starts with xo * : 
Results for a typical noise sequence LV, 1V5 -.-v,} are shown 
for both state variables x1) and x2) Mises. 5.) andmsa2 = 
Note that the filter response closely matches the actual 


system response. The estimation error defined by 
x = X,- X (oly) 


1s compared with the estimation error of the least-square- 


fi leer ner rgs. ower ana 5.4. 


B. SLIDING-WINDOW OBSERVER FOR TIME-INVARIANT SYSTEMS 

For the case of time-invariant observable systems with 
a time-invariant observation matrix it is possible to in- 
crease the memory of the observer to an arbitrary length & 
where 2 > n. This results in a sliding-window observer of 
length 2, where the state vector xX, is determined from the 
last 2 observations in the least-square-error sense. Con- 


sider the first set of 2 observations according to (5.1). 
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where Z) is the observation vector of the last & observa- 
ELOn, Xo is the n dimensional state vector at time instant 
2, and A is the 2 x n matrix of constants relating observa- 
tions and state vector Xp « Since the system is assumed to 


be observable matrix A has rank r =n. Then the solution 


to (5. 18 eisegiven by 


PAZ, (5.19) 


|x> 
i 


where Pp = tata} 2 


Eq. (5.21) may be written more explicitly as 


-1)TyT, Mia } 


foal R 


“&t2)T i 
he a 


+ (9 M +. . stal@ 


2 


(52205 
At time instant &+1l1 Eq. (S08) takes the form of He. (S2Z2i) 


Since the memory of the filter is limited to 2 observations. 


Seen | (5.2 


Themeanalogous to (5.20), .the solution of (9.21) may be 


written as 


A meres Lee 
Z 


_ -2+2,T. T Seer T 
Soe P{(o )M )M 


> +(e PM inizpe ean Zot cee 


e+] 


To obtain a recursive algorithm it is necessary to combine 
(5.22) aeemea (5.20). Thus 


= ee eel 


x - P(e ytuly = ae 


Spe! (te ee ee (On) me oe) 


1 2 
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—2+14 


SUS tee hea ie Zz, = M® Xor which sisMtnuem@iemineiselese 
observations, and premultiplying (5724) with p(o tytp7t 
yields 
P{(@° 7") TM z+... (07) Mz} 

iy p(o ty) Trp7t _ (oT 2t1) TyT yg 7241, S 

=e (oe 1) Sat aoe - (0° ") **Me"*} 0%, 

2 -9 2 = = 
ee 1) TF (6 mel) yo a £42) Ty lo R+2 a 
+ m’mjo? - (07%) Ty™me7*} OX, 
= P{(o)°**) tyme * 414... 407 Mo tox, 
= P{A’A - MM} 6x, 
at A 
= fe Seater em) oax, (5.24) 


mientifying g = pm- (5.24) may be written 


P{(o7***) MT z+...4(0°")TM'z,} = [1- gM] RX, (5.25) 


ims interesting to note that ques sqasvenpbyethe ast column 
of cae as a generalization of (5.9b) of the previous section. 
Substituting (5.25) into (5.22) yields the desired recursive 


relationship 


= [I - gM] Ox + (5.26) 


eS] eee an 


where the estimate x is no longer dependent on the ob- 


Q+1 


servation Z4- It then follows that the next estimate 


aS 


n~ 


ee ee go (5.27) 
1s no longer dependent upon the first two observations Z1 
and Zn Therefore the general recursive formulation for all 
discrete times k > & is given by 

X_, = Fx, + gz.) (5.2.8) 
where F = [I-gM]¢@ and g remain constant. The complete 
algorithm for the general rectangular sliding-window ob- 
server, with the first state estimation available after 
& observations are processed, is then 

xo = fale am C= 

(5 200) 

Sa idle al oe 
The estimation error 
Obeys the same dynamic relation as xe Thus 

Se) ek ESI 0 ae 
If the covariance matrix of X, is denoted by Cy it follows 
Prone (ons 4) erat 

_2 2 ™ T iD 

Chay = X47%,471 = FC, F + gRg (S932) 

where R is the variance of the measurement noise. Let 


Ee, = eer ir) ee (Venn) | 


nS 


where the sequence of matrices, Che is obtained trom (5232 )— 
R= 1, and C, (4,1) denotes the io diagonal element of 

the covariance matrix. Ey is a relative measure of the 
expected absolute estimation error at time instant k. A 
comparison in terms of this relative absolute estimation 
error for a few sliding-windows of the system in Example 5.1 


with the corresponding error of the least-square-error 
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VI. SUMMARY AND RECOMMENDATIONS FOR FURTHER STUDY 


(1) The normalized least-square-error solution of a set 

of inconsistent linear equatons has been established as 

an alternative to the usual least-square-error solution. 
The solution is obtained by minimizing a weighted least- 
Square-error criterion and presented in recursive form 

for obtaining sequentially, the solution of a growing set 
of equations. The technique is illustrated by some 
estimation and identification problems. Further investi- 
gation is required to establish a decision criterion to 
determine whether the normalized least-square-error or 

the least-square-error solution method is to be preferred 
in engineering problems. It is expected that this decision 
A: eevee in the case of problems involving discrete 

state estimation or parameter identification, will depend 
not only upon the system equations themselves but also upon 


the nature of the measurement noise. 


(2) Complete recursive algorithms for the normalized least- 
square-error solution and the least-square-error solution 
based upon the concept of the best approximate solution [5] 


are presented. 


(3) A finite step algorithm for the calculation of the 
pseudo inverse and the best approximate solution of a fixed 


set of linear equations is proposed. This result has 


mle 


advantage over previously published recursive computa- 
tion methods in that the algorithm presented starts 


unconditionally. 


(4) ‘Finite-memory, linear, observation filters in recur- 
sive form are proposed for the sequential state determin- 
ation of a sampled-data system. Design rules for these 
observation filters, when the system is time-invariant, 
are given: 

(a) for the minimum-window and averager observer where 
the system states as determined from the minimum set of 
past data, are smoothed by a simple averager. This 
procedure is shown to be very efficient for an oscillatory 
system with zero damping. Further investigation is recom- 
mended to improve the estimation for general systems, using 
the minimum-window observer together with a weighted 
averager. It is expected that optimal weighting can be 
approached using a scalar weighting factor in the recursive 
£orm. 

(b) for the sliding-winding observer of memory length 
g, & > n, where the n states of the system are determined 


from the last & measurements in the least-Square-error sense. 


(5) In the Appendix, the recursive algorithm of the best 
approximate solution is applied to the numerical solution 
of a set of nonlinear equations. The results are promising 
in that solutions are obtained when other well known linear- 


ization or gradient techniques fail. The theory behind 
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this approach has not been investigated fully and it is 
recommended that further mathematical work be persued to 


establish rigorous proofs and conditions of convergence. 


eek 


APPENDIX A 


A. ITERATIVE SOLUTION OF A SET OF NONLINEAR EQUATIONS 

The solution of simultaneous nonlinear equations is 
often impossible to obtain analytically, and graphical or 
iterative methods for a computer solution have to be em- 
ployed. In addition, the solution of a class of nonlinear 
differential equations, as discussed in Section B, reduces, 
after integration, to the solution of nonlinear algebraic 
equations at each step of the integration process [16]. 
The most commonly used iterative methods are based upon 
linearization techniques, i.e., Newton-Raphson method and 
linear interpolation, or upon some gradient method whereby 
the iterative approximation to the solution is sequentially 
improved such that some error measure is forced to decrease 
[16]. All these methods however may not converge, or they 
may converge to a solution at infinity. In addition, the 
values of the functions as well as their gradients may 
have to be calculated at each iterative approximation. 
This calculation might be impossible if the approximation 
is outside the range or domain of one or more of the 
functions, or if one or more of the functions has a dis- 
continuity at that particular approximation point for the 
solution. Further complications arise from the fact that 
the set of nonlinear equations may have more than one 
solution and the above iteration schemes may converge (if 


they converge at all) to an undeSirable solution point. 
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Using the solution methods for a set of linear equa- 
tions, as discussed in Chapter IV, a new iterative pro- 
cedure is developed. This procedure circumvents the problems 
and drawbacks of the foregoing methods, and converges to a 
Single point if one or more solution points exist. If this 
point is no solution point the initial estimate has to be 
displaced in the direction of the preferred solution point, 
where the preferred or desired solution is defined (in 
accordance with the concept of the minimum-norm solution) 
as the solution which is closest to the initial estimate. 

The class of nonlinear functions included in the iteration 
process in general are all functions which can be represen- 
ted in polynomial form or which have power-series expansions. 

1. Development of the Method 

Let a set of n nonlinear functions inn unknowns be 


given as 
hy (x) 


Alco) =e = 0 (A.1) 


Using the polynomial form, or a power-series representation, 


(A.1) may be written as 


Ax + Bg(x) = c (A.2) 


where A and B are constant matrices of dimensions n xn and 


n X m respectively. The elements of the n x l vector 
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c= [cy «ae e a are the negatives of the constant terms 


in the polynomial or series form. The vector 
g (x) = [Fy (en. (X) ]” has dimensions mxl. Its elements, 
the functions g, (X) are the nonlinear remainder terms of the 
Corresponding Eynetion h(x), Or parts thereof, chosen such 
that the functions g. (x) are defined for all x. 
Range-or-domain-limited functions g. (x) can be accepted 
only if it is possible to reformulate these functions in 
terms of variables for which they are always defined. As 


an example consider the equation 
y = &£nx = 0 | (A. 3a) 


which has no real solution for x 0. However the same 


equation may be written as 


ey - x = 0 (A. 3b) 
Ot 
Z 3 
{l +-y + Hi + as + ...} - x = 0 (A. 3e) 
Or 
[-i + 1] |x] + [1] J, ty) + en (A. 3d) 
My 
where 
2 3 
So eee a _— (A.3e) 


Now g, ly) is defined for all y and the domain-limited Eq. 


(A.3a) is acceptable in the iteration process in the form 


124 


of Eq. (A.3d). This separation of the nonlinear part of h; 
from the linear and constant part has the advantage that 
the discontinuities of h, do not appear in g.. TWAS geet @ i 


example, if 


2yx+x-12=0 (A. 4a) 
or 
[1 mat jemi) Sees (A. 4b) 
y 
y is not defined for x = 0. However the function 
G5(x,y) = x.y (A.4c) 


is defined for all x and y. 


Let 
x =x + Ax (7S) 
—_ —O ——= 


Whore “xeais thewdesrred solution xX is the initial estimate 


for x, and Ax is the necessary correction to Xo im” order tO 


satisfy Eq. (A.1). Eq. (A.2) may be rewritten as 

AAXx + Big (x +Ax) -g(x,) J =c- AX, = Bg (x,) (A. 6a) 
Ore 

iA + BD (Xs Ax)] Ax = ct (A.6b) 


where c' is a vector of constants defined as the right side 
of Eq. (A.6a) and the elements of the m x m matrix D are 


defined from the total difference quotients of the functions 


eS 


[D(x ,Ax)] Ax = g(x, + Ax) - g(x) (A.6c) 


This may be best explained with an example. Again consider 


Ear (&-4¢c). Then 


Go (KX, AG aN act A.) = Go(Xoy Me.) 
} 


(x 4. Ax) (y, + Ay) - XY 


O 


x Ay ne iaixAy (A. 7a) 


which may be written as 


A AX 
[y+ x +5 | (A. 7b) 


Then the terms ve 3° a2) and (xX, + “¥) are elements of the 
matrix D. In general all the elements in the matrix D 

are dependent on the initial estimate, which is a constant 
vector dunzing thes»iteration»process,.and the value) ofmAxy, 
which is unknown. The algorithm proposed sequentially 
updates an estimate for the elements of D(x, , Ax) until the 
true values are obtained. Then the last approximation for 
Axes ene desired correction for the solution given in 
(A.5). From (A.1), and the given initial estimate Xo 


calculate A, B and c" as defined above. Legis denote 


k 
the ta approximation for Ax and set Ax” = 0. Then solve 
iteratively. 
{A + BD (Xo, Ax, ) ] AX, 41 Se (A.8) 
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USING e425) tor AX, 44 until (A.5) satisfies all equations 
in (A.1). The solution is obtained from (4.25) because the 
matrix in (A.8) may become singular at some step in the 
iteration process. This describes the basic technique, 
whereby the solution is obtianed by iterative approximation 
of the exact difference quotients of the functions g, (x) 
and not by iterative improvements of a previous approxi- 


mation to the solution. If the process (A.8) converges and 


the error 


n 
e = en. | (A.9) 


1S acceptably small, the solution (A.5) usually is the 
desired solution closest to the initial estimate. However, 
meeprocess (A.8) converges tO@aeyame tere or witkchae jc 
not small, then x = Xo + Ax liese between Sewolcimsnene 
solutions of (A.1). In this case a displacement of the 
initial estimate is necessary and the iteration has to be 
repeated. As shown in subsequent examples the values for 
the elements of AX, oscillate in a damped manner about 
their exact value. In order to increase the rate of con- 
vergence and, more importantly, to force convergence if the 
damping of these oscillations is slightly negative (which 
would eventually lead to divergence) additional damping may 
be introduced. This is accomplished by using a weighted 
avevage Beevwecti=r ae 


Ke. 


elements of D(x, ,Ax,). 


with Ax. = 0 using (4.25) 
a 


and AX, for the computation of the 


Then solve iteratively starting 


T2 


[A + BD (xo, Ax, | AX.41 = € 


(A.10) 


“a” “aw 
Ax = Ox. 


=k+1 7. ae 


k Kel: 


where a, 0 < @<1, 18 the coefficient of additional damping. 
9 is determined from the rate of convergence of the process 
(AW 10) starting with a= 0. If after a few steps of the 
iteration are completed the convergence rate is considered 
too small, a may be increased and the process reinitiated. 
This completes the development and discussion of the new 
algorithm for the solution of a set of nonlinear equations. 
In order to show the power of the iteration method (A.10) 
two examples, where other techniques may fail, are given 
below. 
Zoe Expemimentad Resuilies 
mae i 


Find the point of itersection of the two curves 


vy Go eny Tox = 


2 (A.11) 


vey xk 


closest to the,given point (Xo rY,)> 
According to (A.6a) and (A.6b) this set of equations may be 


written as 


fk d * + we ‘ a 2 (A.12a) 


is 


OG 


| oy AX | | 
[1 1 20) ‘y+ xt || [ax 
| ~ | 
}o 1 a 1 [2x +dx 0 , asa 
=| 1 - ie Xo 7 2 0 XOYo 
i 0 1 y 0 -1 x2 
O O 
eZ) 
Eq. (A.12b) is now in the desired form for (A.10). 
Starting from the initial estimates 
oe 0 <lel eae x | 8 
a) = b) = Cc) a 
Yo 0 ’ Yo ite Yo 0 


with no additional damping (A.10) yields the following 


sequences for Ax 


k 
a0 23 SG = Sets 
a) pious 
Saini, lakave: | ee onee: i768 
m5 3975 3993 
b) | ee 
22 el A a ere ; , 2.985 
c) | .883 merle .5993 
=o ialaae bib 2 P|. aise 


Thus the following solutions are obtained 
Px] =.15955 
a) = | 
_y aT O3 | 


eo 


X eS oo 


b) — 
Y area 
x 1899 
c) = 
4 =. 1 OS 


The graphs of the functions in (A.11) and the three solu- 
tions are shown in Fig. A.l. Fig. A.2 shows the oscilla- 
tion Of the sequeptialavalues Ax for case (a). Pigs saad 
illustrates the dependence of the rate of convergence upon 
the additional damping a where it is assumed that the 
solution is obtained whenever e€ < in) 

“ Note that in case (b) none of the iteration methods that 
depend upon function values could have initiated an iteration 
and that in case (c) other methods would have converged to 
a different solution point. 

Example 2 


Find the point of intersection of the two curves 


1 


ly + 2xy + x 
(A.13a) 


2a + xX = 


closest to the point of (x +¥,) 

xXx*} ff = 2/7 
The solution y*| = 143 , obtained directly from sub- 
stitution is the only finite solution. Thus, no matter 


what the initial estimate is, the desi#red solution is 


[xe yx}. The set of Eq. (A.13) may be rewritten as 


T3160 
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(A.13b) 


Similar to other iteration methods, (A.9) will diverge 
whenever the initial estimate Se 0. While the other 
methods diverge because they iterate towards the solution 
H = |-75 method (A.9) diverges because the oscillations 
in Ax become increasingly larger. This situation is reme- 
died by using the method of Eq. (A.10) with an additional 
damping factor of approximately a= .7. Now the iteration 
for Ax converges rapidly so that the desired solution is 
obtained in only 8 to 9 iteration steps. The solution 

for the first few iteration steps, starting with the 
initial estimate Fo) -(3] as well as the graphs of the 


Yo 
functions defined by (A.13a), are shown in Fig. A.4. 


B. SOLUTION OF THE DYNAMIC RESPONSE OF CIRCUITS CONTAINING 

NON-LINEAR RESISTIVE ELEMENTS* 

The exact solution of the dynamic response of a circuit 
containing non-linear resistive elements such as diodes or 
transistors often presents problems, even when the non-linear 
characteristics are known, because it may not be possible 
to solve for the required variables explicitly [6]. The 


* 

The material in this section up to Eq. (A.21) has been 
published in the Proceedings of the Second Annual Princeton 
Conference on Information Sciences and Systems, 1968. 


134 


ILLUSTRATION TO EXAMPLE 2 
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non-linear resistive elements are considered as dependent 
current (or voltage) sources, dependent upon state or 
Other variables in the network. The network is then 


characterized by the following equations. 


X = Ax + Byu + Boi, (A.14) 

1 em N £(v_) , (Ao. isp 

a = Re ot 5) we or So a, (A.16) 
where x = the state variables 

u = independent sources 

—s 

i. = equivalent current sources for the non-linear 

elements 
Ver a corresponding voltages across the non-linear 


elements 


The matrices A, Bis Bo, R, Sy) and S5 are determined from 
the network and matrix N is defined from the characteristics 
of the nonlinear network components. The roles of V., and 

i, in (A.14), (A.15), and (A.16) are interchanged when 
dependent voltage sources are used. Eq. (A.14) is the state 
equation for the linear part of the circuit. Eq. (A.15) 
formulates the characteristics of the non-linear components 
as given, for example, by the Ebers and Moll [20] equations 
for transistors. Eq. (A.16) gives the circuit constraints 
(loop equations for equivalent current sources and nodal 


equations for equivalent voltage sources) between the 
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voltage across the nonlinear elements and their currents, 
the states, and the Known sources. 
For a discrete time solution the sources, u, ein Cl aie, 


may be approximated as piecewise linear functions. Thus 


the solution to (A.14) is given by 


oy Feehetl Byes) MgB wetktl) +1 Boi (Ome st (ctl) 
(A.17) 
where 
ee (A.18) 
i -l. ar at. ar | 
yy fe —— ( - I) ] (A.19) 
i T 
= 
a ay? (A (a a) ae) (A.20) 


Substituting (A.15) and (A.17) in (A.16) yields 
ake 1) = R x (SRP 1 BU, (K) +B, (kK) ] 


+ (S,+R 9B, )u, (k+1) + (Sj+RT5B,)N flv (k+l) ] 
(A.21) 
Knowing the values x(k), u(k+1), 1, (k), Ea. (Aw2 6) 
represents n simultaneous non-linear equations which have 
to be solved for v,, (kt1l). Eq. (A.21) may be written in the 


ce rm 


A'v (k+l) +B 
——1 


(Av22) 





oe 


For diodes or transistors the functions g. (v_) are function 


of one variable only and have the general form 


Vay 
g; (v;) = (e - 1) (A.23) 
Thus the method (A.10) is directly applicable and yields 
with the estimate Vv, (K) the solution closest to this point 
v,(kK+1). ‘Using v_(k+1) in (A.15) and (A.16) enables i, (k+1) 


to be calculated. Using these as initial values in (A.21) 


permits the calculation cycle to be reiterated. 
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